[ 
https://issues.apache.org/jira/browse/MESOS-7375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983645#comment-15983645
 ] 

Kevin Klues edited comment on MESOS-7375 at 4/25/17 9:24 PM:
-------------------------------------------------------------

The flag you are thinking of is 
{{--allocator_fairness_excluded_resource_names}} (i.e. you can set it as 
{{--allocator_fairness_excluded_resource_names=gpus}}).
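
For illustration, the flag is passed to the master at startup -- a sketch only 
(the work_dir path is a placeholder, and other flags your deployment requires 
are omitted):

```shell
# Exclude "gpus" from DRF fairness calculations in the built-in allocator.
mesos-master \
  --work_dir=/var/lib/mesos \
  --allocator_fairness_excluded_resource_names=gpus
```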

Regarding motivation for the GPU_RESOURCES capability -- here is an excerpt 
from an email I sent out recently:

"""
Ideally, Marathon (and any other frameworks -- SDKs included) should do some 
sort of preferential scheduling when they opt in to use GPUs. That is, they should 
*prefer* to run GPU jobs on GPU machines and non-GPU jobs on non-GPU machines 
(falling back to running them on GPU machines only if that is all that is 
available).

Additionally, we need a way for an operator to indicate whether GPUs are a 
scarce resource in their cluster or not. We have a flag in Mesos that allows us 
to set this ({{--allocator_fairness_excluded_resource_names=gpus}}), but we 
don't yet have a way of setting this through DC/OS. If we don't set this flag, 
we run the risk of Mesos's DRF algorithm choosing to very rarely send out 
offers from GPU machines once the first GPU job has been launched on them.

As a concrete example, imagine you have a machine with only 1 GPU and you 
launch a task that consumes it -- from DRF's perspective that node now has 100% 
usage of one of its resources. Even if you have 2 GPUs, and one gets consumed, 
DRF still thinks you have consumed 50% of one of its resources. Out of 
fairness, DRF will choose not to send offers from that node until some other 
resource on *all* other nodes approaches 50% as well (which may take a while 
if you are allocating CPUs, memory, and disk in small increments).
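
The arithmetic above can be sketched in a few lines. This is only an 
illustration of the dominant-share calculation (the node totals and task sizes 
are made-up numbers, and real DRF ranks frameworks cluster-wide rather than 
per node):

```python
# DRF ranks consumers by their *dominant* share: the maximum, over all
# resource kinds, of allocated/total.

def dominant_share(allocated, totals):
    """Return the dominant share of an allocation against resource totals."""
    return max(allocated[r] / totals[r] for r in allocated)

# A node with a single GPU: one GPU task pushes the dominant share to 100%,
# even though CPUs and memory are nearly idle.
node = {"cpus": 32.0, "mem": 131072.0, "gpus": 1.0}
gpu_task = {"cpus": 1.0, "mem": 4096.0, "gpus": 1.0}
print(dominant_share(gpu_task, node))   # 1.0

# With 2 GPUs, one consumed still reads as a 50% dominant share.
node2 = {"cpus": 32.0, "mem": 131072.0, "gpus": 2.0}
print(dominant_share(gpu_task, node2))  # 0.5
```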

Right now we don't set {{--allocator_fairness_excluded_resource_names=gpus}} 
in DC/OS (but maybe we should?). Is it the case that most DC/OS users only 
install GPUs on a small number of nodes in their cluster? If so, we should 
consider it a scarce resource and set this flag by default. If not, then GPUs 
aren't actually a scarce resource and we shouldn't be setting this flag -- DRF 
will perform as expected without it.
"""



> provide additional insight for framework developers re: GPU_RESOURCES 
> capability
> --------------------------------------------------------------------------------
>
>                 Key: MESOS-7375
>                 URL: https://issues.apache.org/jira/browse/MESOS-7375
>             Project: Mesos
>          Issue Type: Bug
>          Components: allocation
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> On clusters where all nodes are equal and every node has a GPU, frameworks 
> that **don't** opt-in to the `GPU_RESOURCES` capability won't get any offers. 
> This is surprising for operators.
> Even when a framework doesn't **need** GPU resources, it may make sense for a 
> framework scheduler to provide a `--gpu-cluster-compat` (or similar) flag 
> that results in the framework advertising the `GPU_RESOURCES` capability even 
> though it does not intend to consume any GPU. The effect is that the 
> framework will then receive offers on clusters where all nodes have GPU 
> resources.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
