Oh, I realize now that it's more involved. I guess I am advocating for making it simpler, and trying to determine whether that is something folks with better dev skills than me would be willing to undertake. Basically, I am asking: is there an advantage to giving Drill a simpler way to limit its own CPU use, especially with YARN and/or Mesos, so that Drill plays nice in these globally managed resource clusters?
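
To make the homogeneous case concrete, here is a rough Python sketch (mine, not anything Drill ships) that turns a per-node CPU allocation into the statement Andries mentioned. The function names and the round-down rule are assumptions for illustration only. Since ALTER SYSTEM applies cluster-wide, this does not solve the mixed-size case I described, and it only touches the width option, not the queue settings from the linked docs.

```python
import math


def max_width_for_allocation(allocated_cpus: float) -> int:
    """Round a (possibly fractional) CPU allocation down to whole threads.

    Hypothetical helper: the round-down rule and the floor of 1 are my
    assumptions, not Drill behavior.
    """
    if allocated_cpus <= 0:
        raise ValueError("allocated_cpus must be positive")
    return max(1, math.floor(allocated_cpus))


def alter_system_statement(allocated_cpus: float) -> str:
    """Build the ALTER SYSTEM statement for the given allocation."""
    width = max_width_for_allocation(allocated_cpus)
    return f"ALTER SYSTEM SET `planner.width.max_per_node` = {width};"


if __name__ == "__main__":
    # Example: 20 CPUs handed to the drillbit by the cluster manager.
    print(alter_system_statement(20))
```

The printed statement could be saved to a script file and passed to sqlline with the run option Andries described.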

John

On Tue, Aug 4, 2015 at 3:24 PM, Andries Engelbrecht <[email protected]> wrote:

> It is a bit more involved than just setting the one parameter.
>
> See the link Jacques posted earlier for a better explanation.
> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
>
> --Andries
>
> > On Aug 4, 2015, at 12:01 PM, John Omernik <[email protected]> wrote:
> >
> > Does the planner.width.max_per_node basically set the max CPU cores it
> > can use? So let's say I have a node with 20 physical cores (40 vcores),
> > and I want my drill bit to use 20 of them, is it as simple as
> > planner.width.max_per_node=20? I guess I am trying to figure out a way
> > to basically tell the bit, for all queries, that 20 is the max it can
> > use, because that's where I am going to line things up with Mesos.
> > Additionally, I think setting that at a "query" level is not good,
> > because I could have a homogeneous cluster, and a system-wide value of
> > 20 would work, but what if I have some drill bits that are set to be 10
> > cores, and another set to be 20 because of the difference in sizes?
> > That's where having a "bit level" limitation on the maximum CPU
> > resources a bit can take could be advantageous, especially considering
> > a framework that may be able to spin up and spin down nodes based on
> > cluster resource management.
> >
> > On Tue, Aug 4, 2015 at 11:54 AM, Andries Engelbrecht <[email protected]> wrote:
> >
> >> It is probably best to control things more carefully when using more
> >> specialized environments such as Mesos, than relying on default install
> >> options.
> >> Since the CPU/execution threads in Drill are dynamic you are probably
> >> better off just using
> >> alter system set `planner.width.max_per_node` = <thread count>
> >> to control the CPU utilization.
> >>
> >> Do keep in mind the suggestions by Jacques to take concurrency into
> >> account, etc. when using the queue and width parameters.
> >>
> >> For scripting you can also use sqlline --run=<path/to/script file> to
> >> change the drill config for dynamic options on the fly.
> >>
> >> Have not tried multiple small drillbits, but will likely not be optimal
> >> for resource optimization and management/configuration will be more
> >> challenging.
> >>
> >> --Andries
> >>
> >>> On Aug 4, 2015, at 9:30 AM, Timothy Chen <[email protected]> wrote:
> >>>
> >>> Hi John,
> >>>
> >>> I think Drill will not detect the number of cpus that it was limited
> >>> to by Mesos, since Mesos uses cgroup limits and doesn't really limit
> >>> the number of processors that it can run on.
> >>>
> >>> And yes I think a custom per node drill bit setting is required, which
> >>> is a perfect motivation to have a Drill Mesos Framework that can
> >>> automatically set these configurations for you.
> >>>
> >>> Tim
> >>>
> >>> On Tue, Aug 4, 2015 at 8:23 AM, John Omernik <[email protected]> wrote:
> >>>> This is interesting, but also leads to more questions. :) I hope you
> >>>> don't mind.
> >>>>
> >>>> If I execute Drill using cgroups isolation with Marathon/Mesos, and
> >>>> tell a certain bit to use 4 CPU shares on an 8 CPU node, is Drill
> >>>> going to be aware that it's limited to 4 CPUs and plan accordingly,
> >>>> or will it use some sort of system call to determine the number of
> >>>> cores, not the number of cores/shares it has access to? I could see
> >>>> that being an issue in the default calculation.
> >>>>
> >>>> So that leads me to the next question: if I am running Drill in a
> >>>> shared environment like this, to actually work with this, I have to
> >>>> do a custom per_node setting per drill bit and have that line up with
> >>>> my cgroup resource allocation with Marathon/Mesos... correct?
> >>>>
> >>>> Are there any plans to make this more of a hard env variable that can
> >>>> be passed to the drill bit on start up? This seems to make the
> >>>> coordination a lot easier. Any other options that may make sense?
> >>>>
> >>>> That leads me to another question: is it better to have one big drill
> >>>> bit per node for multiple users to work with, or smaller, say per
> >>>> department, drill bits (but multiple of them) per node? Just looking
> >>>> for planning purposes.
> >>>>
> >>>> Thanks for your help!!
> >>>>
> >>>> John
> >>>>
> >>>> On Tue, Aug 4, 2015 at 9:18 AM, Jacques Nadeau <[email protected]> wrote:
> >>>>
> >>>>> Internally, there are also some soft capabilities. These include
> >>>>> using planner.width.max_per_node and queues:
> >>>>>
> >>>>> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
> >>>>>
> >>>>> --
> >>>>> Jacques Nadeau
> >>>>> CTO and Co-Founder, Dremio
> >>>>>
> >>>>> On Tue, Aug 4, 2015 at 6:38 AM, John Omernik <[email protected]> wrote:
> >>>>>
> >>>>>> I am looking to work with Drill in a managed cluster (having it
> >>>>>> play nice with Mesos). While I can limit the RAM in drill-env.sh,
> >>>>>> the CPU is not limitable; therefore, Drill can just grab all the
> >>>>>> CPU resources it wants. Are there any plans to include some self
> >>>>>> limiting to Drill on CPU resources? In the docs it says use
> >>>>>> CGroups, which I need to read up on, but frameworks like Spark and
> >>>>>> Impala allow you to set the CPU resources in the framework. Is
> >>>>>> CGroups going to get me similar behavior to those? Are there
> >>>>>> disadvantages to setting these resources in Drill itself?
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> John
