So yes, you are correct: you should set it to 1 x 32 x 0.7.

Btw, Drill should already have set this option to 32 x 0.7
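
The arithmetic in this thread can be sketched in a few lines of Python. This is only an illustration, not Drill's actual code: the function name `max_width_per_node` is hypothetical, and rounding up is an assumption, chosen because it reproduces both worked figures in the thread (32 x 0.7 = 23 and 1 * 4 * 0.7 = 3) even though the quoted docs say "rounded down".

```python
def max_width_per_node(cores_per_node: int, drillbits_per_node: int = 1) -> int:
    """Hypothetical helper: drillbits per node * cores per node * 0.7.

    Rounds up (an assumption) so the results match the figures quoted in
    this thread. Integer arithmetic avoids floating-point surprises.
    """
    scaled = drillbits_per_node * cores_per_node * 7  # numerator of x * 0.7
    return (scaled + 9) // 10                         # ceiling of scaled / 10


if __name__ == "__main__":
    # Sixteen 32-core machines, one drillbit each: the node count is NOT a factor.
    print(max_width_per_node(32))  # 1 * 32 * 0.7
    # The docs' example: 2 cores with hyper-threading, i.e. 4 logical cores.
    print(max_width_per_node(4))   # 1 * 4 * 0.7
```

If the value ever needs changing by hand, it can be set with ALTER SYSTEM SET `planner.width.max_per_node` = <value>; the number of nodes in the cluster never enters the calculation.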

On Mon, Feb 15, 2016 at 11:36 AM, Abdel Hakim Deneche <[email protected]> wrote:

> Don't be, it took me quite some time to figure this one out too =P
>
> The "number of active drillbits" refers to the number of Drillbits running
> on each node of the cluster. Generally, you have 1 active Drillbit per node.
>
> On Mon, Feb 15, 2016 at 11:22 AM, John Omernik <[email protected]> wrote:
>
>> I am really sorry for being dense here, but based on your comment and the
>> docs: if you had sixteen 32-core machines with only one drillbit running
>> per node, you'd still use 1 (one drillbit per node) * 32 (the number of
>> cores) * 0.7 (the modifier in the docs) to get 23 as the value to set for
>> planner.width.max_per_node, not 16 * 32 * 0.7. The docs are confusing (see
>> below): you can read "number of active drillbits" as the count across the
>> cluster, which on a sixteen-node cluster with one per node would give
>> 16 * 32 (cores per node) * 0.7. But I think you are saying that we should
>> be taking 1 drillbit per node * 32 * 0.7 ... correct?
>>
>> Quote from the docs:
>> number of active drillbits (typically one per node) * number of cores per
>> node * 0.7
>>
>> On Mon, Feb 15, 2016 at 11:15 AM, Abdel Hakim Deneche <[email protected]> wrote:
>>
>> > No, it's the maximum number of threads each drillbit will be able to spawn
>> > for every major fragment of a query.
>> >
>> > If you run a query on a cluster of 32-core machines, and the query plan
>> > contains multiple major fragments, each major fragment will have "at most"
>> > 32 x 0.7 = 23 minor fragments (or threads) running in parallel on every
>> > drillbit. The "at most" is important here, as other factors limit how many
>> > minor fragments can run in parallel, for example the nature and size of
>> > the data.
>> >
>> > On Mon, Feb 15, 2016 at 7:41 AM, John Omernik <[email protected]> wrote:
>> >
>> > > https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/#configuring-query-queuing
>> > >
>> > >
>> > > On this page, for the setting planner.width.max_per_node, it says the
>> > > below. In the equation "number of active drillbits * number of cores
>> > > per node * 0.7", is the number of active drillbits the number of
>> > > drillbits PER NODE (as this setting is per node), or is it the number of
>> > > active drillbits per cluster? The example is unclear because it only
>> > > shows a single-node cluster. ("Typically one per node" doesn't clarify
>> > > whether that number should be per node or per drillbit.)
>> > >
>> > > Thanks!
>> > >
>> > >
>> > >
>> > > The maximum width per node defines the maximum degree of parallelism for
>> > > any fragment of a query, but the setting applies at the level of a single
>> > > node in the cluster. The *default* maximum degree of parallelism per node
>> > > is calculated as follows, with the theoretical maximum automatically
>> > > scaled back (and rounded down) so that only 70% of the actual available
>> > > capacity is taken into account: number of active drillbits (typically one
>> > > per node) * number of cores per node * 0.7
>> > >
>> > > For example, on a single-node test system with 2 cores and hyper-threading
>> > > enabled: 1 * 4 * 0.7 = 3
>> > >
>> >
>> >
>> >
>> > --
>> >
>> > Abdelhakim Deneche
>> >
>> > Software Engineer
>> >
>> >   <http://www.mapr.com/>
>> >
>> >
>> > Now Available - Free Hadoop On-Demand Training
>> > <
>> >
>> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available
>> > >
>> >
>>
>



-- 

Abdelhakim Deneche

Software Engineer

