I am really sorry for being dense here, but based on your comment then, and the docs then if you had sixteen 32 core machines, but only one drill bit running per node, you'd still use 1 (one drill bit per node) * 32 (the number of cores) * 0.7 (the modifier in the docs) to get 23 as the number to set for planner.width_max_per_node Not 16 * 32 * 0.7. A reading of the docs is confusing (see below) you can read that as number of active drill bits, which on a sixteen node cluster, one per node would be 16 * 32 (cores per node) * 0.7. But I think you are saying that we should be taking 1 drill bit per node * 32 * 0.7 ... correct?
Quote from the docs: number of active drillbits (typically one per node) * number of cores per node * 0.7 On Mon, Feb 15, 2016 at 11:15 AM, Abdel Hakim Deneche <[email protected] > wrote: > No, it's the maximum number of threads each drillbit will be able to spawn > for every major fragment of a query. > > If you run a query on a cluster of 32 core machines, and the query plan > contains multiple major fragments, each major fragment will have "at most" > 32 x 0.7= 23 minor fragments (or threads) running in parallel on every > drillbit. The "at most" is important here, as other factors limit how many > minor fragments can run in parallel, for example nature and size of the > data. > > On Mon, Feb 15, 2016 at 7:41 AM, John Omernik <[email protected]> wrote: > > > * > > > https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/#configuring-query-queuing > > < > > > https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/#configuring-query-queuing > > >* > > > > > > *On this page, on the setting planner.width.max_per_node it says the > > below. In the equation, of number of active drillbits * number of cores > > per node * 0.7, is the number of active drillbits the number of drill > bits > > PER NODE (as this setting is per node) or is that the number of active > > drill bits per cluster? The example is unclear because it only shows an > > example on a single node cluster. (Typically 1 per node doesn't clarify > > whether that number should be per node or per drill bit)* > > > > *Thanks!* > > > > > > > > The maximum width per node defines the maximum degree of parallelism for > > any fragment of a query, but the setting applies at the level of a single > > node in the cluster. The *default* maximum degree of parallelism per node > > is calculated as follows, with the theoretical maximum automatically > scaled > > back (and rounded down) so that only 70% of the actual available capacity > > is taken into account: number of active drillbits (typically one per > node) > > * number of cores per node * 0.7 > > > > For example, on a single-node test system with 2 cores and > hyper-threading > > > > enabled: 1 * 4 * 0.7 = 3 > > > > > > -- > > Abdelhakim Deneche > > Software Engineer > > <http://www.mapr.com/> > > > Now Available - Free Hadoop On-Demand Training > < > http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available > > >
