Benjamin Mahler created MESOS-5527:
--------------------------------------

             Summary: Provide work conservation incentives for schedulers.
                 Key: MESOS-5527
                 URL: https://issues.apache.org/jira/browse/MESOS-5527
             Project: Mesos
          Issue Type: Epic
          Components: allocation, framework
            Reporter: Benjamin Mahler


As we begin to add support for schedulers to revoke resources to obtain their 
quota or fair share, we need to consider the case of non-cooperative or 
malicious schedulers that cause excessive revocation either by accident or 
intentionally.

For example, a malicious scheduler could deliberately keep its allocation below 
its fair share and revoke as many resources as it can in order to disturb 
existing work as much as possible.

We can provide mitigation techniques, or incentives / penalties to schedulers 
that cause excessive revocation:
* Disallow revocation when resources are available to the scheduler. The 
scheduler must choose available resources or wait until allocated resources 
free up. This means picky schedulers may not obtain the resources they want.
* Penalize schedulers causing excessive revocation in order to incentivize them 
to play nicely.
* Use a degree of pessimism to restrict which resources a scheduler can revoke 
(e.g. only batch tasks that have not been running for a long time). If we 
augment task information to know whether it is a service or a batch job we may 
be able to do better here.
* etc
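
As a rough illustration of how the mitigations listed above might combine into 
a single eligibility check, here is a minimal sketch. It is not Mesos code: the 
Task fields, the "batch"/"service" annotation, the revocation counter and both 
thresholds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    kind: str            # "batch" or "service" (hypothetical task annotation)
    runtime_secs: float  # how long the task has been running
    cpus: float


def revocable_tasks(requested_cpus, available_cpus, running,
                    recent_revocations, max_revocations=3,
                    max_runtime_secs=600.0):
    """Return the running tasks a scheduler may revoke (illustrative only)."""
    # 1. Disallow revocation while available resources satisfy the request.
    if available_cpus >= requested_cpus:
        return []
    # 2. Penalty: a scheduler that has used up its budget may not revoke more.
    if recent_revocations >= max_revocations:
        return []
    # 3. Pessimism: only batch tasks that have not been running long qualify.
    return [t for t in running
            if t.kind == "batch" and t.runtime_secs <= max_runtime_secs]
```

With one service task and two batch tasks running, only the batch task that 
started recently would be offered for revocation, and only when nothing is 
available and the scheduler is still within its budget.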

The techniques employed for work conservation in the presence of revocation 
should be configurable, and users should be able to achieve their own custom 
work conservation policies by implementing an allocator (or a subcomponent of 
the existing allocator).
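
One possible shape for such a configurable subcomponent is sketched below. The 
RevocationPolicy interface, the two example policies and the Allocator class 
are all hypothetical and do not correspond to an existing Mesos API; the point 
is only that policies can be composed and swapped through configuration.

```python
from abc import ABC, abstractmethod


class RevocationPolicy(ABC):
    """Hypothetical plug-in point; not an existing Mesos interface."""

    @abstractmethod
    def allows(self, scheduler, task):
        """Return True if `scheduler` may revoke `task`."""

    def on_revoked(self, scheduler, task):
        """Called after a revocation succeeds; no-op by default."""


class BatchOnly(RevocationPolicy):
    # Only tasks annotated as batch jobs may be revoked.
    def allows(self, scheduler, task):
        return task.get("kind") == "batch"


class RevocationBudget(RevocationPolicy):
    # Each scheduler may cause at most `limit` revocations.
    def __init__(self, limit):
        self.limit = limit
        self.used = {}

    def allows(self, scheduler, task):
        return self.used.get(scheduler, 0) < self.limit

    def on_revoked(self, scheduler, task):
        self.used[scheduler] = self.used.get(scheduler, 0) + 1


class Allocator:
    # The configured policies are consulted on every revocation request.
    def __init__(self, policies):
        self.policies = list(policies)

    def try_revoke(self, scheduler, task):
        if not all(p.allows(scheduler, task) for p in self.policies):
            return False
        for p in self.policies:
            p.on_revoked(scheduler, task)
        return True
```

An operator could then configure, for example, 
Allocator([BatchOnly(), RevocationBudget(limit=1)]) or supply a custom policy 
class without touching the rest of the allocator.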



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
