Re: Drill favouring a particular Drillbit

Steven Phillips Wed, 25 Mar 2015 10:19:33 -0700

This is a known issue:

https://issues.apache.org/jira/browse/DRILL-2512
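
To check from a query profile whether fragments are piling up on one node, a rough sketch along these lines may help. It counts minor fragments per host in a profile that has already been parsed into a Python dict. The field names (`fragmentProfile`, `minorFragmentProfile`, `endpoint`, `address`) are assumptions about the profile JSON layout and may differ between Drill versions. The sample data and host names are hand-written and hypothetical.

```python
from collections import Counter


def fragments_per_host(profile):
    """Count minor fragments per Drillbit host in a parsed query profile.

    The key names used here are assumptions about the profile layout,
    not a confirmed schema.
    """
    counts = Counter()
    for major in profile.get("fragmentProfile", []):
        for minor in major.get("minorFragmentProfile", []):
            host = minor.get("endpoint", {}).get("address", "unknown")
            counts[host] += 1
    return counts


# Hypothetical, hand-written profile fragment for illustration only.
sample_profile = {
    "fragmentProfile": [
        {
            "majorFragmentId": 0,
            "minorFragmentProfile": [
                {"endpoint": {"address": "node-a"}},
            ],
        },
        {
            "majorFragmentId": 1,
            "minorFragmentProfile": [
                {"endpoint": {"address": "node-a"}},
                {"endpoint": {"address": "node-a"}},
                {"endpoint": {"address": "node-b"}},
            ],
        },
    ]
}

# Print hosts from most to least loaded.
for host, n in fragments_per_host(sample_profile).most_common():
    print(host, n)
```

A heavily skewed count toward one host would be consistent with the behaviour described below.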


On Wed, Mar 25, 2015 at 8:13 AM, Andries Engelbrecht <
[email protected]> wrote:

> What version of Drill are you running?
>
> Any hints when looking at the query profiles? Is the node that is being
> hammered the foreman for the queries, and are most of the major fragments
> tied to the foreman?
>
> —Andries
>
>
> On Mar 25, 2015, at 12:00 AM, Adam Gilmore <[email protected]> wrote:
>
> > Hi guys,
> >
> > I'm trying to understand how this could be possible.  I have a Hadoop
> > cluster of a name node and two data nodes set up.  All have identical
> > specs in terms of CPU/RAM etc.
> >
> > The two data nodes have a replicated HDFS setup where I'm storing some
> > Parquet files.
> >
> > A Drill cluster (with Zookeeper) is running with Drillbits on all three
> > servers.
> >
> > When I submit a query to *any* of the Drillbits, no matter who the
> > foreman is, one particular data node gets picked to do the vast majority
> > of the work.
> >
> > We've even added three more task nodes to the cluster and everything
> > still puts a huge load on one particular server.
> >
> > There is nothing unique about this data node.  HDFS is fully replicated
> > (no unreplicated blocks) to the other data node.
> >
> > I know that Drill tries to get data locality, so I'm wondering if this is
> > the cause, but this is essentially swamping this data node with 100% CPU
> > usage while leaving the others barely doing any work.
> >
> > As soon as we shut down the Drillbit on this data node, query performance
> > increases significantly.
> >
> > Any thoughts on how I can troubleshoot why Drill is picking that
> > particular node?
>
>


-- 
 Steven Phillips
 Software Engineer

 mapr.com
