FYI-
If you use Fenzo to write your framework, it supports limiting the overall
resources used by tasks through a "group name". That is, all tasks with a
given group name, say "userA", are limited to the resources specified in
that group's limit. For this to work, you specify a limit for each user and
set each task's group name to the user's name, matching the name used in
the limits. Each user can be given different limits, if desired. See
<https://github.com/Netflix/Fenzo/wiki/Resource-Allocation-Limits> for
details.
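
To make the group-name idea concrete, here is a minimal sketch of hard
per-group limits. It is not Fenzo's API; the names Resources, GroupLimiter,
try_launch and release are hypothetical and only illustrate the bookkeeping.
I am also assuming that a task whose group has no configured limit is
rejected, which a real scheduler may handle differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """A resource vector; only CPUs and memory are tracked in this sketch."""
    cpus: float
    mem_mb: float

    def __add__(self, other):
        return Resources(self.cpus + other.cpus, self.mem_mb + other.mem_mb)

    def __sub__(self, other):
        return Resources(self.cpus - other.cpus, self.mem_mb - other.mem_mb)

    def fits_within(self, limit):
        return self.cpus <= limit.cpus and self.mem_mb <= limit.mem_mb


class GroupLimiter:
    """Hypothetical helper: tracks usage per group name against a hard limit."""

    def __init__(self, limits):
        # limits maps a group name (here, a user name) to its Resources limit.
        self.limits = dict(limits)
        self.usage = {name: Resources(0, 0) for name in self.limits}

    def try_launch(self, group, request):
        """Record the task and return True only if the group stays within its limit."""
        if group not in self.limits:
            return False  # assumption: groups without a limit are rejected
        proposed = self.usage[group] + request
        if not proposed.fits_within(self.limits[group]):
            return False
        self.usage[group] = proposed
        return True

    def release(self, group, request):
        """Give resources back to the group when one of its tasks finishes."""
        self.usage[group] = self.usage[group] - request
```

With limits of 8 CPUs for "userA" and 4 CPUs for "userB", a second userA
task that would bring userA to 10 CPUs is refused even if the cluster is
otherwise idle. That is the fragmentation cost of hard limits.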

In general, "fair share" is subjective. Quotas fragment the cluster and can
reduce overall cluster utilization when only a few users are active. One
improvement may be to treat the limits as soft limits, that is, to let users
use resources beyond their limits when there is no contention. However, for
this to work well, we would need one of two things to be true:

1. the rate of task completion is high enough that a user who has not used
the cluster for a while can still get resources soon after returning, or,
2. tasks of users who are consuming more resources than their limits can be
preempted when the resources are needed for other users.
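
The soft-limit idea with preemption (option 2) can be sketched as follows.
This is only an illustration under my own assumptions, not how Fenzo or
Mesos implement it: the class SoftLimitScheduler is hypothetical, only CPUs
are tracked, the cluster is treated as one pool, and victims are chosen
newest-first from users who are currently over their limit.

```python
class SoftLimitScheduler:
    """Hypothetical sketch: limits are soft unless there is contention."""

    def __init__(self, total_cpus, limits):
        self.total_cpus = total_cpus
        self.limits = dict(limits)  # user name -> CPU limit
        self.tasks = []             # (user, cpus) tuples in launch order

    def used(self, user=None):
        """CPUs used by one user, or by everyone if user is None."""
        return sum(c for u, c in self.tasks if user is None or u == user)

    def free(self):
        return self.total_cpus - self.used()

    def launch(self, user, cpus):
        """Return the list of preempted tasks on success, or None on failure."""
        # No contention: anyone may launch, even beyond their limit.
        if cpus <= self.free():
            self.tasks.append((user, cpus))
            return []
        # Contention: only a request that keeps the user within its own
        # limit is allowed to trigger preemption.
        if self.used(user) + cpus > self.limits.get(user, 0):
            return None
        usage = {}
        for owner, size in self.tasks:
            usage[owner] = usage.get(owner, 0) + size
        # Pick victims newest-first among other users who are over their
        # limit at that moment. A victim's owner can end up below its limit
        # if the preempted task is larger than the overage.
        victims, freed = [], self.free()
        for owner, size in reversed(self.tasks):
            if freed >= cpus:
                break
            if owner != user and usage[owner] > self.limits.get(owner, 0):
                victims.append((owner, size))
                usage[owner] -= size
                freed += size
        if freed < cpus:
            return None
        for task in victims:
            self.tasks.remove(task)
        self.tasks.append((user, cpus))
        return victims
```

For example, on a 16-CPU cluster where userA and userB each have a limit of
8, userA can take all 16 CPUs while userB is idle. When userB later asks
for 4 CPUs, one of userA's over-limit tasks is preempted to make room.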

The quota management in Mesos, which Guangya linked to, seems to address
some of these concerns. My understanding is that the MVP is going to be the
equivalent of hard limits.



On Tue, Sep 8, 2015 at 11:55 PM, Guangya Liu <[email protected]> wrote:

> Great that it helps!
>
> I think that running Spark+Aurora+Mesos is a bit heavy, but you can
> give it a try if it meets your requirements. ;-)
>
> In my understanding, what you may want to try is
> Spark + (customized Spark scheduler, leveraging Fenzo or others) + Mesos, but
> this may involve some code changes to Spark.
>
> Thanks,
>
> Guangya
>
> On Wed, Sep 9, 2015 at 2:05 PM, RJ Nowling <[email protected]> wrote:
>
>> Thanks, Guangya!
>>
>> Inspired by your comments, I've also been thinking about the option of
>> using Apache Aurora to provide some of the features I want.  Spark could be
>> deployed in standalone mode on top of Aurora on top of Mesos. :)
>>
>> Funny enough, two of my colleagues (Tim St. Clair and Erik Erlandson)
>> seem to be tracking and commenting on the epic you linked to.  :)
>>
>> On Wed, Sep 9, 2015 at 12:59 AM, Guangya Liu <[email protected]> wrote:
>>
>>> Hi RJ, please check my answers in line.
>>>
>>> Thanks,
>>>
>>> Guangya
>>>
>>> On Wed, Sep 9, 2015 at 1:24 PM, RJ Nowling <[email protected]> wrote:
>>>
>>>> Hi Guangya,
>>>>
>>>> My use case is actually trying to run Spark (in coarse grain mode) with
>>>> multiple users. I wanted ways to better ensure fair scheduling across
>>>> users. Spark provides very few primitives so I was hoping I could use Mesos
>>>> to limit resources per user and control how the cluster is partitioned. For
>>>> example, I may prefer that Spark jobs share multiple machines without
>>>> using all resources on a single machine for fault tolerance.
>>>>
>>> For this scenario, you may want to schedule the offered resources again
>>> at the framework level; you can leverage Fenzo or something similar to
>>> enhance the scheduler part of Spark to achieve your goal.
>>>
>>>>
>>>> I'm also considering the case of running multiple frameworks. In this
>>>> case, frameworks would have to coordinate to enforce user quotas and such.
>>>> It seems that this would be better solved somewhere below the framework
>>>> level.
>>>>
>>> For this scenario, there is an epic for "quota management" which can
>>> meet your requirement, but it is still in progress and not available yet.
>>> Epic: https://issues.apache.org/jira/browse/MESOS-1791
>>> Design doc:
>>> https://docs.google.com/document/d/16iRNmziasEjVOblYp5bbkeBZ7pnjNlaIzPQqMTHQ-9I/edit?pli=1#heading=h.9g7fqjh6652v
>>>
>>>>
>>>> RJ
>>>>
>>>>
>>>>
>>>> On Sep 8, 2015, at 11:47 PM, Guangya Liu <[email protected]> wrote:
>>>>
>>>> Hi RJ,
>>>>
>>>> I think that your final goal is to use a framework running
>>>> on top of Mesos to execute some tasks. Such logic should be in the
>>>> framework. Netflix open sourced a framework scheduler library
>>>> named Fenzo; you may want to take a look at it to see if it can
>>>> help you.
>>>>
>>>>
>>>> http://techblog.netflix.com/2015/08/fenzo-oss-scheduler-for-apache-mesos.html
>>>> https://github.com/Netflix/Fenzo
>>>>
>>>> Thanks,
>>>>
>>>> Guangya
>>>>
>>>> ------------------------------
>>>> Date: Tue, 8 Sep 2015 23:09:36 -0500
>>>> Subject: Re: Setting maximum per-node resources in offers
>>>> From: [email protected]
>>>> To: [email protected]
>>>>
>>>> Thanks, Klaus.
>>>>
>>>> I think I was probably misunderstanding the role of the allocator in
>>>> Mesos versus the scheduler in the framework sitting on top of Mesos.
>>>> Probably out of scope for Mesos to divide up resources as I was suggesting.
>>>>
>>>> On Tue, Sep 8, 2015 at 10:48 PM, Klaus Ma <[email protected]> wrote:
>>>>
>>>> If it's the only framework, you will receive all nodes from Mesos as
>>>> offers. You can re-schedule those resources to run tasks on each node.
>>>>
>>>>
>>>> On 2015-09-09 03:03, RJ Nowling wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I have a smallish cluster with a lot of cores and RAM per node.  I want
>>>> to support multiple users so I'd like to set up Mesos to provide a maximum
>>>> of 8 cores per node in the resource offers.  Resource offers should include
>>>> multiple nodes to reach the requirements of the user.  For example, if the
>>>> user requests 32 cores, I would like 8 cores from each of 4 nodes.
>>>>
>>>> Is this possible?  Or can someone suggest alternatives?
>>>>
>>>> Thanks,
>>>> RJ
>>>>
>>>>
>>>> --
>>>> Klaus Ma (马达), PMP® | http://www.cguru.net
>>>>
>>>>
>>>>
>>>
>>
>
