As an FYI, preliminary support to work around this issue for GPUs will
appear in the 1.0 release
https://reviews.apache.org/r/48914/

This doesn't solve the problem of scarce resources in general, but it
will at least keep non-GPU workloads from starving out GPU-based
workloads on GPU-capable machines. The downside of this approach is
that only GPU-aware frameworks will be able to launch tasks on
GPU-capable machines (meaning some of their resources could go unused
unnecessarily). We decided this tradeoff is acceptable for now.
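
For frameworks that want to opt in, a minimal sketch of what that looks
like (assuming the capability is exposed as
FrameworkInfo::Capability::GPU_RESOURCES, per the review linked above):

    #include <mesos/mesos.hpp>

    // Sketch only: advertise the GPU capability on FrameworkInfo so the
    // allocator will keep GPU-capable agents in this framework's offers.
    mesos::FrameworkInfo makeFrameworkInfo() {
      mesos::FrameworkInfo framework;
      framework.set_user("");  // Let Mesos fill in the current user.
      framework.set_name("gpu-aware-framework");
      framework.add_capabilities()->set_type(
          mesos::FrameworkInfo::Capability::GPU_RESOURCES);
      return framework;
    }

The resulting FrameworkInfo is passed to the scheduler driver as usual;
frameworks that don't set the capability simply won't receive offers
from GPU-capable agents.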

Kevin

On Tue, Jun 21, 2016 at 1:40 PM, Elizabeth Lingg
<elizabeth_li...@apple.com> wrote:
> Thanks, looking forward to discussion and review on your document. The main 
> use case I see here is that some of our frameworks will want to request the 
> GPU resources, and we want to make sure that those frameworks are able to 
> successfully launch tasks on agents with those resources. We want to be 
> certain that other frameworks that do not require GPUs will not request all
> other resources on those agents (i.e. cpu, disk, memory) which would mean the 
> GPU resources are not allocated and the frameworks that require them will not 
> receive them. As Ben Mahler mentioned, "(2) Because we do not have revocation 
> yet, if a framework decides to consume the non-GPU resources on a GPU 
> machine, it will prevent the GPU workloads from running!” This will occur for 
> us in clusters where we have higher utilization as well as different types of 
> workloads running. Smart task placement then becomes more relevant (i.e. we 
> want to be able to schedule with scarce resources successfully and we may 
> have considerations like not scheduling too many I/O bound workloads on a 
> single host or more stringent requirements for scheduling persistent tasks).
>
>  Elizabeth Lingg
>
>
>
>> On Jun 20, 2016, at 7:24 PM, Guangya Liu <gyliu...@gmail.com> wrote:
>>
>> Had some discussion with Ben M about the following two solutions:
>>
>> 1) Ben M: Create sub-pools of resources based on machine profile and
>> perform fair sharing / quota within each pool, plus a framework
>> capability GPU_AWARE so that the allocator can filter scarce resources
>> out of offers to frameworks that lack it.
>> 2) Guangya: Add new sorters for non-scarce resources, plus a framework
>> capability GPU_AWARE so that the allocator can filter out scarce
>> resources for some frameworks.
>>
>> Both solutions amount to the same thing, and there is no real difference
>> between them: creating sub-pools of resources requires introducing a
>> separate sorter for each sub-pool, so I will merge the two into one.
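>>
>> As a rough, illustrative sketch of the allocator-side rule both proposals
>> share (the types below are hypothetical, not Mesos internals): agents
>> holding a scarce resource are simply not offered to frameworks that have
>> not advertised the corresponding capability.
>>
>>     #include <string>
>>
>>     // Hypothetical types, for illustration only.
>>     struct Agent { std::string id; bool hasScarceResource; };
>>     struct Framework { std::string id; bool scarceAware; };
>>
>>     // Scarce-resource agents are only offered to frameworks that opted
>>     // in; everything else falls through to normal wDRF allocation.
>>     bool shouldOffer(const Agent& agent, const Framework& framework) {
>>       if (agent.hasScarceResource && !framework.scarceAware) {
>>         return false;
>>       }
>>       return true;
>>     }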
>>
>> I also had some discussion with Ben about AlexR's proposal to implement
>> "requestResource". This API should be treated as an improvement over the
>> pessimistic way resource allocation works today (e.g. we offer/decline
>> the GPUs to 1000 frameworks before offering them to the GPU framework
>> that wants them). "requestResource" provides *more information* to
>> Mesos; namely, it gives us awareness of demand.
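>>
>> A sketch of where that demand signal would come from, using the existing
>> requestResources() scheduler driver call (the resource amounts below are
>> made up, and today the built-in allocator does not act on these
>> requests):
>>
>>     #include <vector>
>>
>>     #include <mesos/resources.hpp>
>>     #include <mesos/scheduler.hpp>
>>
>>     // Express GPU demand up front instead of waiting for offers.
>>     void requestGpus(mesos::SchedulerDriver* driver) {
>>       mesos::Request request;
>>
>>       mesos::Resources wanted =
>>         mesos::Resources::parse("gpus:2;cpus:4;mem:8192").get();
>>
>>       for (const mesos::Resource& resource : wanted) {
>>         request.add_resources()->CopyFrom(resource);
>>       }
>>
>>       driver->requestResources({request});
>>     }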
>>
>> In some cases "requestResource" could be used to acquire all of the
>> scarce resources, and once those scarce resources are in use the wDRF
>> sorter would sort the non-scarce resources as usual. The problem is that
>> we cannot guarantee the framework issuing "requestResource" will always
>> consume all of the scarce resources before they are allocated to other
>> frameworks.
>>
>> I'm planning to draft a document based on solution 1) "Create sub-pools"
>> as the long-term solution; any comments are welcome!
>>
>> Thanks,
>>
>> Guangya
>>
>> On Sat, Jun 18, 2016 at 11:58 AM, Guangya Liu <gyliu...@gmail.com> wrote:
>>
>>> Thanks, Du Fan. So you mean that we should have some clear rules, in the
>>> documentation or somewhere else, to guide cluster admins on which
>>> resources should be classified as scarce, right?
>>>
>>> On Sat, Jun 18, 2016 at 2:38 AM, Du, Fan <fan...@intel.com> wrote:
>>>
>>>>
>>>>
>>>> On 2016/6/17 7:57, Guangya Liu wrote:
>>>>
>>>>> @Fan Du,
>>>>>
>>>>> Currently, I think that scarce resources should be defined by the
>>>>> cluster admin, who can specify them via a flag when the master starts
>>>>> up.
>>>>>
>>>>
>>>> This is not what I mean.
>>>> IMO, it's not the cluster admin's call to decide which resources should
>>>> be marked as scarce. They can carry out the operation, but they should
>>>> be advised based on a clear rule: to what extent is the resource scarce
>>>> compared with other resources, and how will it affect wDRF by causing
>>>> starvation for frameworks that hold scarce resources? That's my point.
>>>>
>>>> To my best knowledge, a quantitative study of how wDRF behaves in
>>>> scenarios with one or multiple scarce resources would first help to
>>>> verify the proposed approach and guide users of this functionality.
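>>>>
>>>> As a back-of-the-envelope illustration of that starvation effect (the
>>>> numbers are made up): under DRF a framework is ranked by its dominant
>>>> share, so holding even one unit of a very scarce resource dominates
>>>> everything else it holds.
>>>>
>>>>     #include <algorithm>
>>>>     #include <cstdio>
>>>>
>>>>     int main() {
>>>>       // Hypothetical cluster totals: CPUs plentiful, GPUs scarce.
>>>>       const double totalCpus = 1000.0;
>>>>       const double totalGpus = 2.0;
>>>>
>>>>       // Framework A holds 1 GPU and 4 CPUs; framework B holds 100 CPUs.
>>>>       double shareA = std::max(1.0 / totalGpus, 4.0 / totalCpus);  // 0.50
>>>>       double shareB = 100.0 / totalCpus;                           // 0.10
>>>>
>>>>       // DRF favors the framework with the lower dominant share, so A is
>>>>       // starved of further offers until B's share catches up to 0.50.
>>>>       std::printf("A: %.2f  B: %.2f\n", shareA, shareB);
>>>>       return 0;
>>>>     }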
>>>>
>>>>
>>>>
>>>>> Regarding the proposal of generic scarce resources, do you have any
>>>>> thoughts on this? I can see that giving framework developers the
>>>>> option of defining scarce resources may bring trouble to Mesos; it is
>>>>> better to let Mesos define what is scarce rather than framework
>>>>> developers.
>>>>>
>>>>
>>>
>



-- 
~Kevin
