Re: CPU Resource Management

John Omernik Tue, 04 Aug 2015 14:26:57 -0700

Oh, I realize it's more involved now. I guess I am advocating making it
simpler, and trying to determine whether that is something folks with better
dev skills than mine would be willing to undertake. Basically, I am asking:
is there an advantage to a more limited, per-bit setting, especially when
running under YARN and/or Mesos, so that Drill plays nice in these globally
managed resource clusters?
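
A minimal sketch of how a per-bit width could be derived from a cgroup CPU
allocation, assuming the bit runs under a cgroup v1 CFS quota (the values
found in cpu.cfs_quota_us and cpu.cfs_period_us). The function names are
hypothetical and nothing here is part of Drill; only the option name
planner.width.max_per_node comes from this thread.

```python
def cores_from_cfs_quota(quota_us: int, period_us: int, host_cores: int) -> int:
    """Whole cores allowed by a CFS quota, capped at the host core count.

    Assumption: a quota of -1 (or any non-positive value) means "no limit",
    in which case the host core count is returned unchanged.
    """
    if quota_us <= 0 or period_us <= 0:
        return host_cores
    allowed = quota_us // period_us  # round down to whole cores
    return max(1, min(host_cores, allowed))


def alter_system_statement(width: int) -> str:
    """Build the ALTER SYSTEM statement quoted later in this thread."""
    return f"ALTER SYSTEM SET `planner.width.max_per_node` = {width};"


if __name__ == "__main__":
    # Example: a bit limited to 20 of 40 vcores (quota = 20 * period).
    width = cores_from_cfs_quota(2_000_000, 100_000, host_cores=40)
    print(alter_system_statement(width))
```

The generated statement could be written to a script file and run through
sqlline. Since the thread treats this option as a system-wide value, this
only illustrates the arithmetic; a true per-bit limit is still the open
question.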


John

On Tue, Aug 4, 2015 at 3:24 PM, Andries Engelbrecht <
[email protected]> wrote:

> It is a bit more involved than just setting the one parameter.
>
> See the link Jacques posted earlier for a better explanation.
> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
>
> —Andries
>
> > On Aug 4, 2015, at 12:01 PM, John Omernik <[email protected]> wrote:
> >
> > Does planner.width.max_per_node basically set the max CPU cores it can
> > use? So let's say I have a node with 20 physical cores (40 vcores), and I
> > want my drill bit to use 20 of them. Is it as simple as
> > planner.width.max_per_node=20? I guess I am trying to figure out a way to
> > tell the bit, for all queries, that 20 is the max it can use, because
> > that's where I am going to line things up with Mesos. Additionally, I
> > think setting that at a "query" level is not good, because I could have a
> > homogeneous cluster where a system-wide value of 20 would work, but what
> > if I have some drill bits that are set to 10 cores and others set to 20
> > because of the difference in sizes? That's where having a "bit level"
> > limit on the maximum CPU resources a bit can take could be advantageous,
> > especially considering a framework that may be able to spin nodes up and
> > down based on cluster resource management.
> >
> > On Tue, Aug 4, 2015 at 11:54 AM, Andries Engelbrecht <
> > [email protected]> wrote:
> >
> > >> It is probably best to control things more carefully when using more
> > >> specialized environments such as Mesos, rather than relying on default
> > >> install options.
> > >> Since the CPU/execution threads in Drill are dynamic, you are probably
> > >> better off just using
> > >> alter system set `planner.width.max_per_node` = <thread count>
> > >> to control the CPU utilization.
> >>
> > >> Do keep in mind the suggestions by Jacques to take concurrency into
> > >> account, etc., when using the queue and width parameters.
> > >>
> > >> For scripting you can also use sqlline --run=<path/to/script file> to
> > >> change the Drill config for dynamic options on the fly.
> >>
> > >> I have not tried multiple small drillbits, but that will likely not be
> > >> optimal for resource utilization, and management/configuration will be
> > >> more challenging.
> >>
> >> —Andries
> >>
> >>
> >>
> >>> On Aug 4, 2015, at 9:30 AM, Timothy Chen <[email protected]> wrote:
> >>>
> >>> Hi John,
> >>>
> >>> I think Drill will not detect the number of cpus that it was limited
> >>> to by Mesos, since Mesos uses cgroup limits and doesn't really limit
> >>> the number of processors that it can run on.
> >>>
> > >>> And yes, I think a custom per-node drill bit setting is required, which
> > >>> is a perfect motivation to have a Drill Mesos framework that can
> > >>> automatically set these configurations for you.
> >>>
> >>> Tim
> >>>
> >>>
> >>>
> >>> On Tue, Aug 4, 2015 at 8:23 AM, John Omernik <[email protected]> wrote:
> > >>>> This is interesting, but also leads to more questions. :) I hope you
> > >>>> don't mind.
> >>>>
> > >>>> If I execute Drill using cgroups isolation with Marathon/Mesos, and
> > >>>> tell a certain bit to use 4 CPU shares on an 8-CPU node, is Drill going
> > >>>> to be aware that it's limited to 4 CPUs and plan accordingly, or will
> > >>>> it use some sort of system call to determine the number of cores, not
> > >>>> the number of cores/shares it has access to? I could see that being an
> > >>>> issue in the default calculation.
> >>>>
> > >>>> So that leads me to the next question: if I am running Drill in a
> > >>>> shared environment like this, to actually work with this, I have to do
> > >>>> a custom per_node setting per drill bit and have that line up with my
> > >>>> cgroup resource allocation with Marathon/Mesos... correct?
> >>>>
> > >>>> Are there any plans to make this more of a hard env variable that can
> > >>>> be passed to the drill bit on startup? This seems like it would make
> > >>>> the coordination a lot easier. Any other options that may make sense?
> >>>>
> > >>>> That leads me to another question: is it better to have one big drill
> > >>>> bit per node for multiple users to work with, or smaller, say
> > >>>> per-department, drill bits (but multiple of them) per node? Just asking
> > >>>> for planning purposes.
> >>>>
> > >>>> Thanks for your help!!
> >>>>
> >>>> John
> >>>>
> >>>> On Tue, Aug 4, 2015 at 9:18 AM, Jacques Nadeau <[email protected]>
> >> wrote:
> >>>>
> > >>>>> Internally, there are also some soft capabilities. These include
> > >>>>> using planner.width.max_per_node and queues:
> >>>>>
> >>>>>
> >>
> https://drill.apache.org/docs/configuring-resources-for-a-shared-drillbit/
> >>>>>
> >>>>> --
> >>>>> Jacques Nadeau
> >>>>> CTO and Co-Founder, Dremio
> >>>>>
> >>>>> On Tue, Aug 4, 2015 at 6:38 AM, John Omernik <[email protected]>
> wrote:
> >>>>>
> > >>>>>> I am looking to work with Drill in a managed cluster (having it play
> > >>>>>> nice with Mesos). While I can limit the RAM in drill-env.sh, the CPU
> > >>>>>> is not limitable, so Drill can just grab all the CPU resources it
> > >>>>>> wants. Are there any plans to include some self-limiting in Drill for
> > >>>>>> CPU resources? The docs say to use cgroups, which I need to read up
> > >>>>>> on, but frameworks like Spark and Impala allow you to set the CPU
> > >>>>>> resources in the framework. Are cgroups going to get me similar
> > >>>>>> behavior to those? Are there disadvantages to setting these resources
> > >>>>>> in Drill itself?
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> John
> >>>>>>
> >>>>>
> >>
> >>
>
>
