Looks like this definitely is the following bug: https://issues.apache.org/jira/browse/DRILL-2512
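For anyone wanting to confirm the same pattern on their own cluster: the JSON query profiles served from the web UI on port 8047 record which node acted as foreman and where every minor fragment ran. Here's a rough Python sketch that tallies fragments per node - the /profiles/<queryid>.json path is what the profile pages serve, but the "foreman" and "fragmentProfile" field names are just what we see in our profiles and may differ across Drill versions; the host and query id are placeholders.

    # Tally minor fragments per drillbit from a query's JSON profile.
    import json
    from collections import Counter
    from urllib.request import urlopen

    DRILLBIT = "http://drillbit-host:8047"  # placeholder: any drillbit's web UI
    QUERY_ID = "your-query-id-here"         # placeholder: from the profiles page

    profile = json.load(urlopen(f"{DRILLBIT}/profiles/{QUERY_ID}.json"))
    print("foreman:", profile.get("foreman", {}).get("address"))

    per_node = Counter()
    for major in profile.get("fragmentProfile", []):
        for minor in major.get("minorFragmentProfile", []):
            per_node[minor.get("endpoint", {}).get("address")] += 1

    for node, count in per_node.most_common():
        print(f"{node}: {count} minor fragments")

If one address dominates that tally query after query, it lines up with the CPU skew we're seeing.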
It's a pretty severe performance bottleneck having the foreman do so much work. In our environment, the foreman sits at basically 95-100% CPU while the other drillbits barely do any work. That means it's nearly impossible for us to scale out.

On Wed, Apr 8, 2015 at 3:58 PM, Adam Gilmore <[email protected]> wrote:

> Anyone have any more thoughts on this? Anywhere I can start trying to
> troubleshoot?
>
> On Thu, Mar 26, 2015 at 4:13 PM, Adam Gilmore <[email protected]> wrote:
>
>> So there are 5 Parquet files, each ~125 MB - I'm not sure what I can
>> provide re the block locations. I believe each file is under the HDFS
>> block size, so they should be stored contiguously.
>>
>> I've tried setting the affinity factor to various values (1, 0, etc.),
>> but nothing seems to change that. It always prefers certain nodes.
>>
>> Moreover, we added a stack more nodes and it started picking very
>> specific nodes as foremen (perhaps 2-3 nodes out of 20 were always
>> picked as foremen). The foremen were therefore being swamped with CPU
>> while the other nodes were doing very little work.
>>
>> On Thu, Mar 26, 2015 at 12:12 PM, Steven Phillips <[email protected]> wrote:
>>
>>> Actually, I believe a query submitted through the REST interface will
>>> instantiate a DrillClient, which uses the same ZKClusterCoordinator
>>> that sqlline uses, and thus the foreman for the query is not
>>> necessarily on the same drillbit the query was submitted to. But I'm
>>> still not sure it's related to DRILL-2512.
>>>
>>> I'll wait for your additional info before speculating further.
>>>
>>> On Wed, Mar 25, 2015 at 6:54 PM, Adam Gilmore <[email protected]> wrote:
>>>
>>> > We actually set up a separate load balancer for port 8047 (we're
>>> > submitting these queries via the REST API at the moment), so
>>> > Zookeeper etc. is out of the equation; thus I doubt we're hitting
>>> > DRILL-2512.
>>> >
>>> > When shutting down the "troublesome" drillbit, it starts
>>> > parallelizing much more nicely again. We even added 10+ nodes to the
>>> > cluster, and as long as that particular drillbit is shut down, it
>>> > distributes very nicely. The minute we start the drillbit on that
>>> > node again, it starts swamping it with work.
>>> >
>>> > I'll shoot through the JSON profiles and some more information on
>>> > the dataset etc. later today (Australian time!).
>>> >
>>> > On Thu, Mar 26, 2015 at 5:31 AM, Steven Phillips <[email protected]> wrote:
>>> >
>>> > > I didn't notice at first that Adam said "no matter who the foreman
>>> > > is".
>>> > >
>>> > > Another suspicion I have is that our current logic for assigning
>>> > > work will assign to the exact same nodes every time we query a
>>> > > particular table. Changing the affinity factor may change the
>>> > > assignment, but it will still be the same every time. That is my
>>> > > suspicion, but I am not sure why shutting down the drillbit would
>>> > > improve performance. I would expect that shutting down the
>>> > > drillbit would result in a different drillbit becoming the
>>> > > hotspot.
>>> > >
>>> > > On Wed, Mar 25, 2015 at 12:16 PM, Jacques Nadeau <[email protected]> wrote:
>>> > >
>>> > > > On Steven's point, the node that the client connects to is not
>>> > > > currently randomized. Given your description of the behavior,
>>> > > > I'm not sure whether you're hitting 2512 or just general
>>> > > > undesirable distribution.
>>> > > >
>>> > > > On Wed, Mar 25, 2015 at 10:18 AM, Steven Phillips <[email protected]> wrote:
>>> > > >
>>> > > > > This is a known issue:
>>> > > > >
>>> > > > > https://issues.apache.org/jira/browse/DRILL-2512
>>> > > > >
>>> > > > > On Wed, Mar 25, 2015 at 8:13 AM, Andries Engelbrecht <[email protected]> wrote:
>>> > > > >
>>> > > > > > What version of Drill are you running?
>>> > > > > >
>>> > > > > > Any hints when looking at the query profiles? Is the node
>>> > > > > > that is being hammered the foreman for the queries, and are
>>> > > > > > most of the major fragments tied to the foreman?
>>> > > > > >
>>> > > > > > —Andries
>>> > > > > >
>>> > > > > > On Mar 25, 2015, at 12:00 AM, Adam Gilmore <[email protected]> wrote:
>>> > > > > >
>>> > > > > > > Hi guys,
>>> > > > > > >
>>> > > > > > > I'm trying to understand how this could be possible. I
>>> > > > > > > have a Hadoop cluster set up with a name node and two data
>>> > > > > > > nodes. All have identical specs in terms of CPU/RAM etc.
>>> > > > > > >
>>> > > > > > > The two data nodes have a replicated HDFS setup where I'm
>>> > > > > > > storing some Parquet files.
>>> > > > > > >
>>> > > > > > > A Drill cluster (with Zookeeper) is running with drillbits
>>> > > > > > > on all three servers.
>>> > > > > > >
>>> > > > > > > When I submit a query to *any* of the drillbits, no matter
>>> > > > > > > who the foreman is, one particular data node gets picked
>>> > > > > > > to do the vast majority of the work.
>>> > > > > > >
>>> > > > > > > We've even added three more task nodes to the cluster, and
>>> > > > > > > everything still puts a huge load on one particular
>>> > > > > > > server.
>>> > > > > > >
>>> > > > > > > There is nothing unique about this data node. HDFS is
>>> > > > > > > fully replicated (no unreplicated blocks) to the other
>>> > > > > > > data node.
>>> > > > > > >
>>> > > > > > > I know that Drill tries for data locality, so I'm
>>> > > > > > > wondering if this is the cause, but it is essentially
>>> > > > > > > swamping this data node with 100% CPU usage while leaving
>>> > > > > > > the others barely doing any work.
>>> > > > > > >
>>> > > > > > > As soon as we shut down the drillbit on this data node,
>>> > > > > > > query performance increases significantly.
>>> > > > > > >
>>> > > > > > > Any thoughts on how I can troubleshoot why Drill is
>>> > > > > > > picking that particular node?
>>> > > > >
>>> > > > > --
>>> > > > > Steven Phillips
>>> > > > > Software Engineer
>>> > > > >
>>> > > > > mapr.com
>>> > >
>>> > > --
>>> > > Steven Phillips
>>> > > Software Engineer
>>> > >
>>> > > mapr.com
>>>
>>> --
>>> Steven Phillips
>>> Software Engineer
>>>
>>> mapr.com
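For completeness, since both the REST submission path and the affinity factor come up in the quoted thread above - a minimal sketch of the two knobs from the REST side. The /query.json endpoint and the {"queryType": "SQL"} payload are the documented REST call on port 8047; the host and table path are placeholders, and planner.affinity_factor is the option we tried varying (1, 0, etc.) with no change in assignment.

    # Submit SQL to Drill over REST and adjust the affinity factor.
    import json
    from urllib.request import Request, urlopen

    DRILLBIT = "http://drillbit-host:8047"  # placeholder: the load-balanced address

    def run_sql(sql):
        # POST /query.json with {"queryType": "SQL", "query": ...}.
        req = Request(
            f"{DRILLBIT}/query.json",
            data=json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        return json.load(urlopen(req))

    # ALTER SYSTEM rather than ALTER SESSION: each REST request gets a fresh
    # session, so a session-scoped option wouldn't stick between calls.
    run_sql("ALTER SYSTEM SET `planner.affinity_factor` = 0.0")

    # Placeholder table path - point this at your own Parquet data.
    print(run_sql("SELECT COUNT(*) FROM dfs.`/path/to/parquet`"))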
