Are you able to run the Drill daemons (drillbits) on the HDFS nodes
instead of on a remote machine? Besides likely addressing the latency
issues, Drill optimizes for data locality, so network traffic is
minimized (each drillbit processes the data local to its own node).
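
For anyone reading the profile row quoted further down, here is a minimal
sketch of the arithmetic, assuming (as Jinfeng notes below) that "First
Start" reflects time spent before the fragment began running. The helper
name split_profile_times is hypothetical and not part of Drill; the two
numbers are copied from Wen's profile sample.

```python
# Values copied from the profile row quoted in this thread (seconds).
FIRST_START = 53.668
LAST_END = 53.877


def split_profile_times(first_start, last_end):
    """Hypothetical helper: split a profile row into the time elapsed
    before the fragment started and the time the fragment actually ran."""
    run_time = round(last_end - first_start, 3)
    return first_start, run_time


wait, run = split_profile_times(FIRST_START, LAST_END)
print(f"time before fragment started: {wait:.3f}s")
print(f"fragment run time:            {run:.3f}s")
```

The run time works out to the same 0.209s reported in the tmin/tavg/tmax
columns, so nearly all of the elapsed time falls before the fragment
started rather than in execution.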

On Wed, Sep 9, 2015 at 10:22 PM, Liu, Wen <[email protected]> wrote:

> Found the root cause: these 2 tables are on HDFS, and there have been
> some network stability issues between the Drill server and the namenode
> these days.
> Thanks guys for your support, really appreciate it.
>
> Thanks,
> Wen
>
> On 9/10/15, 11:04 AM, "Jinfeng Ni" <[email protected]> wrote:
>
> >FirstStart/LastStart indicate the query planning time, which means
> >the planner took ~53 seconds to produce the plan.
> >
> >Is rpp_event_group or rpp_event a view or a base table? The query
> >you posted contains just a two-table join. 53s seems too long
> >to plan such a query, unless either 'rpp_event_group' or 'rpp_event'
> >is a view that itself contains a join.
> >
> >
> >
> >On Wed, Sep 9, 2015 at 7:38 PM, Liu, Wen <[email protected]> wrote:
> >
> >> Here it is,
> >> select a.url_txt as url, a.rpp_event_grp_id as grpid from
> >> dfs.tdhdp.`rpp_event_group` as a, dfs.tdhdp.`rpp_event` as b where
> >> a.site_id = 0 and b.active_ind = 1 and a.rpp_event_grp_id =
> >> b.rpp_event_grp_id group by a.url_txt, a.rpp_event_grp_id
> >>
> >>
> >> Below is the tdhdp definition:
> >> "tdhdp": {
> >>       "location": "/user/tdhdp/latest",
> >>       "writable": true,
> >>       "defaultInputFormat": "parquet"
> >>     }
> >>
> >> Let me know if anything else is needed.
> >>
> >> Thanks,
> >> Wen
> >>
> >>
> >> On 9/10/15, 10:31 AM, "Hsuan Yi Chu" <[email protected]> wrote:
> >>
> >> >Can you share your query? What does it look like?
> >> >
> >> >On Wed, Sep 9, 2015 at 7:24 PM, Liu, Wen <[email protected]> wrote:
> >> >
> >> >> Hi there,
> >> >> I am running a drillbit on localhost, and ZooKeeper is on the same
> >> >> host. I sent the same query multiple times; most runs respond very
> >> >> quickly (<500ms), but a few are slow. On the query profile page, the
> >> >> slow queries show a long time in the "First Start" column; see the
> >> >> sample below. Any hints as to the possible reason?
> >> >>
> >> >> I tried both 0.8.0 and 1.1.0, and got the same result.
> >> >>
> >> >> Major Fragment  Minor Fragments Reporting  First Start  Last Start  First End  Last End  tmin   tavg   tmax   memmax
> >> >> 00-xx-xx        1 / 1                      53.668       53.668      53.877     53.877    0.209  0.209  0.209  27MB
> >> >>
> >> >> Thanks,
> >> >> Wen
> >> >>
> >>
> >>
>
>


-- 
Tomer Shiran
CEO and Co-Founder, Dremio
