Great! I'm looking forward to seeing all these solutions. :)

On Thu, Sep 10, 2015 at 12:37 PM, David Greenberg <[email protected]> wrote:
> Fair sharing, preemptive Spark with multitenant support is something
> that we built at Two Sigma to allow our users to fairly share a giant
> cluster. It allows users to use more resources on a less busy cluster,
> keeps things fair on a busy cluster, and guarantees that no small job
> needs to wait more than a minute or so to start. We're currently open
> sourcing the entire system, including the multitenant Mesos framework
> and Spark scheduler integration, and it should be ready to use in under
> 4 weeks.
>
> I'd recommend using our system, Cook. If you'd like to hear more about
> the math behind its scheduling algorithms, watch this talk by Li Jin at
> MesosCon: http://youtu.be/BkBMYUe76oI
>
> After our initial open sourcing is done, we plan on integrating Fenzo
> to improve Cook's bin packing and its ability to do locality-aware
> scheduling.
>
> On Thu, Sep 10, 2015 at 1:05 PM Sharma Podila <[email protected]> wrote:
>
>> FYI-
>> If you are going to use Fenzo to write your framework, it has support
>> for limiting the overall resources used by tasks via a "group name".
>> That is, all tasks with a given group name, say "userA", would be
>> limited to the resources specified in the limit for that group. For
>> this to work, you would specify the limits for each user and set each
>> task's group name to the user's name, matching the limits. Each user
>> can be given different limits, if desired. See
>> <https://github.com/Netflix/Fenzo/wiki/Resource-Allocation-Limits> for
>> details.
>>
>> In general, "fair share" is subjective. Quotas fragment the cluster
>> and can reduce overall cluster utilization when only a few users are
>> active. One improvement may be to treat the limits as soft limits.
>> That is, let users use resources beyond their limits if there is no
>> contention. However, for this to work well, one of two things needs to
>> be true:
>>
>> 1. the rate of task completion is high enough that a new user will be
>> able to get resources after not having used the cluster for a while,
>> or,
>> 2. users' tasks that are consuming more resources than their limits
>> can be preempted when needed for other users.
>>
>> The quota management in Mesos that Guangya linked to seems to address
>> some of these concerns. My understanding is that the MVP is going to
>> be the equivalent of hard limits.
>>
>> On Tue, Sep 8, 2015 at 11:55 PM, Guangya Liu <[email protected]> wrote:
>>
>>> Great that it helps!
>>>
>>> I think that running Spark + Aurora + Mesos is a bit heavyweight, but
>>> you can give it a try if it meets your requirements. ;-)
>>>
>>> In my understanding, you may want to try Spark + (a customized Spark
>>> scheduler, leveraging Fenzo or others) + Mesos, but this may involve
>>> some code changes to Spark.
>>>
>>> Thanks,
>>>
>>> Guangya
>>>
>>> On Wed, Sep 9, 2015 at 2:05 PM, RJ Nowling <[email protected]> wrote:
>>>
>>>> Thanks, Guangya!
>>>>
>>>> Inspired by your comments, I've also been thinking about the option
>>>> of using Apache Aurora to provide some of the features I want. Spark
>>>> could be deployed in standalone mode on top of Aurora on top of
>>>> Mesos. :)
>>>>
>>>> Funny enough, two of my colleagues (Tim St. Clair and Erik
>>>> Erlandson) seem to be tracking and commenting on the epic you linked
>>>> to. :)
>>>>
>>>> On Wed, Sep 9, 2015 at 12:59 AM, Guangya Liu <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi RJ, please check my answers inline.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Guangya
>>>>>
>>>>> On Wed, Sep 9, 2015 at 1:24 PM, RJ Nowling <[email protected]> wrote:
>>>>>
>>>>>> Hi Guangya,
>>>>>>
>>>>>> My use case is actually trying to run Spark (in coarse-grained
>>>>>> mode) with multiple users. I wanted ways to better ensure fair
>>>>>> scheduling across users.
>>>>>> Spark provides very few primitives, so I was hoping I could use
>>>>>> Mesos to limit resources per user and control how the cluster is
>>>>>> partitioned. For example, I may prefer that Spark jobs share
>>>>>> multiple machines rather than using all the resources on a single
>>>>>> machine, for fault tolerance.
>>>>>>
>>>>> For this scenario, you may want to schedule the offered resources
>>>>> again at the framework level; you can leverage Fenzo or something
>>>>> else to enhance the scheduler part of Spark to achieve your goal.
>>>>>
>>>>>> I'm also considering the case of running multiple frameworks. In
>>>>>> this case, frameworks would have to coordinate to enforce user
>>>>>> quotas and such. It seems that this would be better solved
>>>>>> somewhere below the framework level.
>>>>>>
>>>>> For this scenario, there is an epic for "quota management" which
>>>>> can meet your requirement, but it is still in progress and not
>>>>> available yet.
>>>>> Epic: https://issues.apache.org/jira/browse/MESOS-1791
>>>>> Design doc:
>>>>> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?pli=1#heading=h.9g7fqjh6652v
>>>>>
>>>>>> RJ
>>>>>>
>>>>>> On Sep 8, 2015, at 11:47 PM, Guangya Liu <[email protected]> wrote:
>>>>>>
>>>>>> Hi RJ,
>>>>>>
>>>>>> I think that your final goal is that you want to use a framework
>>>>>> running on top of Mesos to execute some tasks. Such logic should
>>>>>> be in the framework part. Netflix open sourced a framework
>>>>>> scheduler library named Fenzo; you may want to take a look at it
>>>>>> to see if it can help you.
>>>>>>
>>>>>> http://techblog.netflix.com/2015/08/fenzo-oss-scheduler-for-apache-mesos.html
>>>>>> https://github.com/Netflix/Fenzo
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Guangya
>>>>>>
>>>>>> ------------------------------
>>>>>> Date: Tue, 8 Sep 2015 23:09:36 -0500
>>>>>> Subject: Re: Setting maximum per-node resources in offers
>>>>>> From: [email protected]
>>>>>> To: [email protected]
>>>>>>
>>>>>> Thanks, Klaus.
>>>>>>
>>>>>> I think I was probably misunderstanding the role of the allocator
>>>>>> in Mesos versus the scheduler in the framework sitting on top of
>>>>>> Mesos. It's probably out of scope for Mesos to divide up resources
>>>>>> as I was suggesting.
>>>>>>
>>>>>> On Tue, Sep 8, 2015 at 10:48 PM, Klaus Ma <[email protected]> wrote:
>>>>>>
>>>>>> If it's the only framework, you will receive all nodes from Mesos
>>>>>> as offers. You can re-schedule those resources to run tasks on
>>>>>> each node.
>>>>>>
>>>>>> On Sep 9, 2015, at 03:03, RJ Nowling wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a smallish cluster with a lot of cores and RAM per node. I
>>>>>> want to support multiple users, so I'd like to set up Mesos to
>>>>>> offer a maximum of 8 cores per node in resource offers. Resource
>>>>>> offers should span multiple nodes to meet the user's requirements.
>>>>>> For example, if a user requests 32 cores, I would like 8 cores
>>>>>> from each of 4 nodes.
>>>>>>
>>>>>> Is this possible? Or can someone suggest alternatives?
>>>>>>
>>>>>> Thanks,
>>>>>> RJ
>>>>>>
>>>>>> --
>>>>>> Klaus Ma (马达), PMP® | http://www.cguru.net
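Sharma's soft-limit idea further up the thread can be sketched in a few lines of plain Python. This is a hypothetical illustration of the policy, not Fenzo's or Mesos's API: each user (group) has a limit, and a request beyond that limit is granted only when no other user is active. "Contention" is deliberately simplified here to "some other user holds resources".

```python
# Hypothetical sketch of the "soft limit" idea: a user's hard limit
# becomes soft when there is no contention from other users.

class SoftLimitAllocator:
    def __init__(self, total_cpus, limits):
        self.total_cpus = total_cpus
        self.limits = limits                    # e.g. {"userA": 8, "userB": 8}
        self.used = {u: 0 for u in limits}      # cpus currently held per user

    def free_cpus(self):
        return self.total_cpus - sum(self.used.values())

    def request(self, user, cpus):
        """Grant within the user's limit; beyond it only if no contention."""
        if cpus > self.free_cpus():
            return False                        # cluster has no room at all
        within_limit = self.used[user] + cpus <= self.limits[user]
        no_contention = all(self.used[u] == 0
                            for u in self.limits if u != user)
        if within_limit or no_contention:
            self.used[user] += cpus
            return True
        return False                            # over limit while others are active

    def release(self, user, cpus):
        self.used[user] -= cpus
```

Condition 2 in Sharma's list (preemption) would correspond to reclaiming an over-limit grant when another user's request is denied; that bookkeeping is omitted here to keep the sketch small.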
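RJ's original requirement (at most 8 cores per node, spread across as many nodes as needed) can be handled at the framework level along the lines Klaus describes: take the offers you receive and carve a capped slice out of each node until the request is met. A minimal sketch, with offers modeled as hypothetical (hostname, cpus) pairs rather than real Mesos offer objects:

```python
# Hypothetical sketch: split a request across offers, taking at most
# `per_node_cap` cpus from any single node. Offers are modeled as
# (hostname, available_cpus) pairs, not real Mesos offer objects.

def place(request_cpus, offers, per_node_cap=8):
    """Return a {hostname: cpus} placement, or None if it can't be met."""
    placement = {}
    remaining = request_cpus
    for hostname, available in offers:
        if remaining == 0:
            break
        take = min(available, per_node_cap, remaining)
        if take > 0:
            placement[hostname] = take
            remaining -= take
    return placement if remaining == 0 else None
```

A 32-core request against four 16-core nodes yields 8 cores from each node; the unused portion of each offer would then be declined back to Mesos so other frameworks can use it.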

