Don't be, it took me quite some time to figure this one out too =P

The "number of active drillbits" refers to the number of Drillbits running
on each node of the cluster, not the total across the cluster. Generally,
you have 1 active Drillbit per node.
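
To make the arithmetic concrete, here is a minimal sketch of the per-node
calculation, not Drill's actual code. The function name and its parameters
are made up for illustration. The docs text says the result is "rounded
down", but the 23 used in this thread and the docs' own example (1 * 4 *
0.7 = 3) only come out if the value is rounded up, so the sketch shows both
and treats rounding up as an assumption.

```python
def max_width_per_node(drillbits_per_node, cores_per_node, percent=70, round_up=True):
    """Per-node width: drillbits on the node * cores on the node * 0.7.

    Hypothetical helper for illustration only; it is not part of Drill.
    """
    # Integer math (70/100 instead of 0.7) avoids floating-point surprises.
    scaled = drillbits_per_node * cores_per_node * percent
    if round_up:
        return -(-scaled // 100)  # ceiling division
    return scaled // 100          # floor division


print(max_width_per_node(1, 32))                  # 23, the figure used in this thread
print(max_width_per_node(1, 32, round_up=False))  # 22, if the value is rounded down
print(max_width_per_node(1, 4))                   # 3, the single-node example in the docs
```

Note that the cluster size (16 nodes in John's example) never enters the
calculation: only the Drillbits and cores on a single node do.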

On Mon, Feb 15, 2016 at 11:22 AM, John Omernik <[email protected]> wrote:

> I am really sorry for being dense here, but based on your comment and the
> docs: if you had sixteen 32-core machines with only one drill bit running
> per node, you'd still use 1 (one drill bit per node) * 32 (the number of
> cores) * 0.7 (the modifier in the docs) to get 23 as the value to set for
> planner.width.max_per_node, not 16 * 32 * 0.7. The docs are confusing (see
> below): you can read "number of active drillbits" as the cluster-wide
> count, which on a sixteen-node cluster with one per node would give 16 * 32
> (cores per node) * 0.7. But I think you are saying that we should be taking
> 1 drill bit per node * 32 * 0.7 ... correct?
>
> Quote from the docs:
> number of active drillbits (typically one per node) * number of cores per
> node * 0.7
>
> On Mon, Feb 15, 2016 at 11:15 AM, Abdel Hakim Deneche <[email protected]> wrote:
>
> > No, it's the maximum number of threads each drillbit will be able to
> > spawn for every major fragment of a query.
> >
> > If you run a query on a cluster of 32-core machines, and the query plan
> > contains multiple major fragments, each major fragment will have "at
> > most" 32 x 0.7 = 23 minor fragments (or threads) running in parallel on
> > every drillbit. The "at most" is important here, as other factors limit
> > how many minor fragments can run in parallel, for example the nature and
> > size of the data.
> >
> > On Mon, Feb 15, 2016 at 7:41 AM, John Omernik <[email protected]> wrote:
> >
> > > https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/#configuring-query-queuing
> > >
> > >
> > > On this page, for the setting planner.width.max_per_node, it says the
> > > below. In the equation "number of active drillbits * number of cores
> > > per node * 0.7", is the number of active drillbits the number of drill
> > > bits PER NODE (as this setting is per node), or is it the number of
> > > active drill bits per cluster? The example is unclear because it only
> > > shows a single-node cluster. ("Typically one per node" doesn't clarify
> > > whether that number should be per node or per drill bit.)
> > >
> > > Thanks!
> > >
> > >
> > >
> > > The maximum width per node defines the maximum degree of parallelism
> > > for any fragment of a query, but the setting applies at the level of a
> > > single node in the cluster. The *default* maximum degree of parallelism
> > > per node is calculated as follows, with the theoretical maximum
> > > automatically scaled back (and rounded down) so that only 70% of the
> > > actual available capacity is taken into account: number of active
> > > drillbits (typically one per node) * number of cores per node * 0.7
> > >
> > > For example, on a single-node test system with 2 cores and
> > > hyper-threading enabled: 1 * 4 * 0.7 = 3
> > >
>



-- 

Abdelhakim Deneche

Software Engineer

