@Fan, in the community meeting a question was raised about which frameworks might be ready to use this. Can you provide some more context on immediate use cases on the framework side?
--
*Joris Van Remoortere*
Mesosphere

On Fri, Jun 17, 2016 at 12:51 AM, Du, Fan <fan...@intel.com> wrote:

> A couple of rough thoughts in the early morning:
>
> a. Is there a quantitative way to decide whether a resource is scarce? I
> mean, how can we help operators decide whether or not to use this
> functionality when deploying Mesos?
>
> b. Scarce resources extend from GPUs to, to name a few, Xeon Phi and FPGAs.
> What about making the proposal more generic and future-proof?
>
> On 2016/6/11 10:50, Benjamin Mahler wrote:
>
>> I wanted to start a discussion about the allocation of "scarce" resources.
>> "Scarce" in this context means resources that are not present on every
>> machine. GPUs are the first example of a scarce resource that we support
>> as a known resource type.
>>
>> Consider the behavior when there are the following agents in a cluster:
>>
>> 999 agents with (cpus:4,mem:1024,disk:1024)
>> 1 agent with (gpus:1,cpus:4,mem:1024,disk:1024)
>>
>> Here there are 1000 machines but only 1 has GPUs. We call GPUs a "scarce"
>> resource here because they are only present on a small percentage of the
>> machines.
>>
>> We end up with some problematic behavior here with our current allocation
>> model:
>>
>> (1) If a role wishes to use both GPU and non-GPU resources for tasks,
>> consuming 1 GPU will lead DRF to consider the role to have a 100% share of
>> the cluster, since it consumes 100% of the GPUs in the cluster. This
>> framework will then not receive any other offers.
>>
>> (2) Because we do not have revocation yet, if a framework decides to
>> consume the non-GPU resources on a GPU machine, it will prevent the GPU
>> workloads from running!
>>
>> --------
>>
>> I filed an epic [1] to track this. The plan for the short term is to
>> introduce two mechanisms to mitigate these issues:
>>
>> -Introduce a resource fairness exclusion list. This allows the shares
>> of resources like "gpus" to be excluded from the dominant share.
>>
>> -Introduce a GPU_AWARE framework capability. This indicates that the
>> scheduler is aware of GPUs and will schedule tasks accordingly. Old
>> schedulers will not have the capability and will not receive any offers
>> for GPU machines. If a scheduler has the capability, we'll advise that
>> they avoid placing their additional non-GPU workloads on the GPU machines.
>>
>> --------
>>
>> Longer term, we'll want a more robust way to manage scarce resources. The
>> first thought we had was to have sub-pools of resources based on machine
>> profile and perform fair sharing / quota within each pool. This addresses
>> (1) cleanly, and for (2) the operator needs to explicitly disallow non-GPU
>> frameworks from participating in the GPU pool.
>>
>> Unfortunately, by excluding non-GPU frameworks from the GPU pool we may
>> have a lower level of utilization. In the even longer term, as we add
>> revocation it will be possible to allow a scheduler desiring GPUs to
>> revoke the resources allocated to the non-GPU workloads running on the
>> GPU machines. There are a number of things we need to put in place to
>> support revocation ([2], [3], [4], etc), so I'm glossing over the details
>> here.
>>
>> If anyone has any thoughts or insight in this area, please share!
>>
>> Ben
>>
>> [1] https://issues.apache.org/jira/browse/MESOS-5377
>> [2] https://issues.apache.org/jira/browse/MESOS-5524
>> [3] https://issues.apache.org/jira/browse/MESOS-5527
>> [4] https://issues.apache.org/jira/browse/MESOS-4392
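
To make issue (1) and the proposed fairness exclusion list concrete, here is a rough Python sketch of the arithmetic using the 1000-agent example from Ben's message. It is only an illustration, not Mesos allocator code: the function name `dominant_share`, the `excluded` parameter, and the sample task size (1 GPU, 1 CPU, 256 mem) are assumptions made up for this example.

```python
# Hypothetical illustration only; this is not the Mesos allocator.

def dominant_share(allocated, total, excluded=frozenset()):
    """Return the largest per-resource share, skipping excluded resource names."""
    shares = [
        allocated.get(name, 0) / amount
        for name, amount in total.items()
        if amount > 0 and name not in excluded
    ]
    return max(shares, default=0.0)


# Cluster totals from the example: 1000 agents with (cpus:4,mem:1024,disk:1024),
# exactly one of which also has gpus:1.
total = {"cpus": 4 * 1000, "mem": 1024 * 1000, "disk": 1024 * 1000, "gpus": 1}

# Assumed allocation: a role running one task that uses the only GPU.
allocated = {"gpus": 1, "cpus": 1, "mem": 256}

# Without an exclusion list, the single GPU dominates the role's share.
share_default = dominant_share(allocated, total)

# With "gpus" excluded, only cpus/mem/disk count toward the dominant share.
share_excluded = dominant_share(allocated, total, excluded={"gpus"})

print(share_default, share_excluded)
```

Under these assumptions the role's dominant share drops from 100% of the cluster to a small fraction of a percent once "gpus" is excluded, which is the effect the exclusion list is meant to have on fair sharing.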