Are you able to run the Drill daemons (drillbits) on the HDFS nodes themselves, as opposed to running them remotely? In addition to likely addressing the latency issues, this lets Drill optimize for data locality so that network traffic is minimized (each drillbit processes the data local to its node).
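
As a quick sanity check on the profile numbers quoted further down in this thread, here is a small Python sketch of the arithmetic. The variable names are mine; the values are copied from the quoted profile row, and treating First Start as time spent before execution (mostly planning) follows Jinfeng's reading rather than anything I have verified independently.

```python
# Values copied from the quoted profile row for fragment 00-xx-xx (seconds).
first_start = 53.668  # time at which the fragment started
first_end = 53.877    # time at which the fragment finished
tmax = 0.209          # longest fragment run time reported in the profile

# Time the fragment actually spent running.
run_seconds = round(first_end - first_start, 3)

# Share of the elapsed time that passed before the fragment even started.
wait_share = first_start / first_end

print(f"fragment run time: {run_seconds} s (profile tmax: {tmax} s)")
print(f"share of elapsed time before the fragment started: {wait_share:.1%}")
```

In other words, the fragment itself ran for a fraction of a second, and nearly all of the elapsed time passed before it started.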

On Wed, Sep 9, 2015 at 10:22 PM, Liu, Wen <[email protected]> wrote:

> Got the root cause, these 2 tables are on hdfs, there are some network
> stability issues between the drill server and namenode these days.
> Thanks guys for your support, really appreciate it.
>
> Thanks,
> Wen
>
> On 9/10/15, 11:04 AM, "Jinfeng Ni" <[email protected]> wrote:
>
> >FirstStart/LastStart indicates the query planning time. That
> >indicates planner takes ~53 seconds to get the plan.
> >
> >Is rpp_event_group or rpp_event a view or base table? The query
> >you posted just contain a two table join. Seems 53s is too long
> >to plan for such a query, unless either 'rpp_event_group' or 'rpp_event'
> >is a view which contains join as well.
> >
> >On Wed, Sep 9, 2015 at 7:38 PM, Liu, Wen <[email protected]> wrote:
> >
> >> Here it is,
> >> select a.url_txt as url, a.rpp_event_grp_id as grpid from
> >> dfs.tdhdp.`rpp_event_group` as a, dfs.tdhdp.`rpp_event` as b where
> >> a.site_id = 0 and b.active_ind = 1 and a.rpp_event_grp_id =
> >> b.rpp_event_grp_id group by a.url_txt, a.rpp_event_grp_id
> >>
> >> Below is the tdhdp definition:
> >> "tdhdp": {
> >>   "location": "/user/tdhdp/latest",
> >>   "writable": true,
> >>   "defaultInputFormat": "parquet"
> >> }
> >>
> >> Let me know if anything else needed.
> >>
> >> Thanks,
> >> Wen
> >>
> >> On 9/10/15, 10:31 AM, "Hsuan Yi Chu" <[email protected]> wrote:
> >>
> >> >Can you share your query ? What does it look like ?
> >> >
> >> >On Wed, Sep 9, 2015 at 7:24 PM, Liu, Wen <[email protected]> wrote:
> >> >
> >> >> Hi there,
> >> >> I am running drillbit in localhost, and the zookeeper is also on the
> >> >> same host. I sent the same query multiple times, most of them response
> >> >> very quickly(<500ms). But a few are slowly. On query profile page, I saw
> >> >> the slow queries have a long time for "First Start" column, see below
> >> >> sample.
> >> >> Any hints what's the possible reason?
> >> >>
> >> >> I tried both 0.8.0 and 1.1.0, and got the same result.
> >> >>
> >> >> Major Fragment  Minor Fragments Reporting  First Start  Last Start  First End  Last End  tmin   tavg   tmax   memmax
> >> >> 00-xx-xx        1 / 1                      53.668       53.668      53.877     53.877    0.209  0.209  0.209  27MB
> >> >>
> >> >> Thanks,
> >> >> Wen

--
Tomer Shiran
CEO and Co-Founder, Dremio
