Thomas Marshall created MESOS-489:
-------------------------------------

             Summary: Fix Hierarchical DRF to work correctly
                 Key: MESOS-489
                 URL: https://issues.apache.org/jira/browse/MESOS-489
             Project: Mesos
          Issue Type: Bug
            Reporter: Thomas Marshall


The current naive implementation of hierarchical DRF can cause frameworks to 
starve in certain situations.

For example, imagine a cluster with two resources, cpu and gpu, and two roles, 
r1 and r2. r1 has a single framework, f1, which only wants cpu. r2 has two 
frameworks: f2, which only wants gpu, and f3, which wants both.

Assume that each role initially starts with 50% of all resources. f1 will 
return its share of gpus since it doesn't need them. Those gpus will then be 
offered to r2 and accepted by f2, putting r2's share above 0.5. Similarly, f2 
will return the cpus it doesn't need, which will be offered to and accepted by 
f1. Even if either of those offers is given to f3, f3 cannot accept it, since 
neither offer contains both of the resources that f3 needs. This pattern will 
continue until f1 has all of the cpus, f2 has all of the gpus, and f3 is 
starved.
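
The scenario above can be sketched as a small toy model. This is not the Mesos 
allocator code, only an illustration under stated assumptions: the cluster size 
(10 cpus, 10 gpus), the names WANTS, ROLE_OF, accepts, simulate and role_share 
are all hypothetical, r2's initial half is assumed to be held by f2, and every 
returned offer is shown to f3 first to give it the best possible chance.

```python
# Toy model of the starvation scenario; not the actual Mesos allocator.
TOTAL = {"cpu": 10, "gpu": 10}  # assumed cluster size

# Resources each framework must see together in one offer to accept it.
WANTS = {"f1": {"cpu"}, "f2": {"gpu"}, "f3": {"cpu", "gpu"}}
ROLE_OF = {"f1": "r1", "f2": "r2", "f3": "r2"}


def accepts(framework, offer):
    """A framework accepts only if the offer has every resource it wants."""
    return all(offer.get(res, 0) > 0 for res in WANTS[framework])


def simulate():
    """Return the final per-framework holdings after unwanted resources
    are returned and re-offered."""
    half = {res: amount // 2 for res, amount in TOTAL.items()}
    # Assumption: each role starts with 50%, held by f1 (r1) and f2 (r2).
    held = {"f1": dict(half), "f2": dict(half), "f3": {"cpu": 0, "gpu": 0}}

    # Each framework returns whatever it does not want, which produces
    # offers that contain a single resource type.
    offers = []
    for fw in ("f1", "f2"):
        for res in TOTAL:
            if res not in WANTS[fw] and held[fw][res] > 0:
                offers.append({res: held[fw][res]})
                held[fw][res] = 0

    # Show every returned offer to f3 first, then to the other frameworks.
    for offer in offers:
        for fw in ("f3", "f1", "f2"):
            if accepts(fw, offer):
                for res, amount in offer.items():
                    held[fw][res] += amount
                break
    return held


def role_share(held, role):
    """Largest fraction of any single resource held by a role's frameworks."""
    return max(
        sum(h[res] for fw, h in held.items() if ROLE_OF[fw] == role) / TOTAL[res]
        for res in TOTAL
    )


if __name__ == "__main__":
    final = simulate()
    for fw in sorted(final):
        print(fw, final[fw])
    print("r2 share:", role_share(final, "r2"))
```

In this model f3 declines both single-resource offers, so f1 ends up with 
every cpu, f2 with every gpu, and f3 with nothing, while r2's share rises 
above 0.5.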

For more detail, see Ali Ghodsi.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
