I see, will try this approach.

On 9/10/15, 1:26 PM, "Tomer Shiran" <[email protected]> wrote:

>Are you able to set up the Drill daemons (drillbit) on HDFS as opposed to
>having them remotely? In addition to the latency issues which would likely
>be addressed, Drill will optimize for data locality so that the amount of
>network traffic is minimized (each drillbit processes the data local to
>that node).
>
>On Wed, Sep 9, 2015 at 10:22 PM, Liu, Wen <[email protected]> wrote:
>
>> Got the root cause, these 2 tables are on hdfs, there are some network
>> stability issues between the drill server and namenode these days.
>> Thanks guys for your support, really appreciate it.
>>
>> Thanks,
>> Wen
>>
>> On 9/10/15, 11:04 AM, "Jinfeng Ni" <[email protected]> wrote:
>>
>> >FirstStart/LastStart indicates the query planning time. That
>> >indicates planner takes ~53 seconds to get the plan.
>> >
>> >Is rpp_event_group or rpp_event a view or base table? The query
>> >you posted just contain a two table join. Seems 53s is too long
>> >to plan for such a query, unless either 'rpp_event_group' or
>> >'rpp_event' is a view which contains join as well.
>> >
>> >
>> >
>> >On Wed, Sep 9, 2015 at 7:38 PM, Liu, Wen <[email protected]> wrote:
>> >
>> >> Here it is,
>> >> select a.url_txt as url, a.rpp_event_grp_id as grpid from
>> >> dfs.tdhdp.`rpp_event_group` as a, dfs.tdhdp.`rpp_event` as b where
>> >> a.site_id = 0 and b.active_ind = 1 and a.rpp_event_grp_id =
>> >> b.rpp_event_grp_id group by a.url_txt, a.rpp_event_grp_id
>> >>
>> >>
>> >> Below is the tdhdp definition:
>> >> "tdhdp": {
>> >>   "location": "/user/tdhdp/latest",
>> >>   "writable": true,
>> >>   "defaultInputFormat": "parquet"
>> >> }
>> >>
>> >> Let me know if anything else needed.
>> >>
>> >> Thanks,
>> >> Wen
>> >>
>> >>
>> >> On 9/10/15, 10:31 AM, "Hsuan Yi Chu" <[email protected]> wrote:
>> >>
>> >> >Can you share your query ? What does it look like ?
>> >> >
>> >> >On Wed, Sep 9, 2015 at 7:24 PM, Liu, Wen <[email protected]> wrote:
>> >> >
>> >> >> Hi there,
>> >> >> I am running drillbit in localhost, and the zookeeper is also on the
>> >> >> same host. I sent the same query multiple times, most of them
>> >> >> response very quickly(<500ms). But a few are slowly. On query profile
>> >> >> page, I saw the slow queries have a long time for "First Start"
>> >> >> column, see below sample.
>> >> >> Any hints what's the possible reason?
>> >> >>
>> >> >> I tried both 0.8.0 and 1.1.0, and got the same result.
>> >> >>
>> >> >> Major Fragment  Minor Fragments Reporting  First Start  Last Start  First End  Last End  tmin   tavg   tmax   memmax
>> >> >> 00-xx-xx        1 / 1                      53.668       53.668      53.877     53.877    0.209  0.209  0.209  27MB
>> >> >>
>> >> >> Thanks,
>> >> >> Wen
>> >> >>
>> >> >>
>> >> >>
>
>
>--
>Tomer Shiran
>CEO and Co-Founder, Dremio
