Benjamin Mahler created MESOS-5527:
--------------------------------------
Summary: Provide work conservation incentives for schedulers.
Key: MESOS-5527
URL: https://issues.apache.org/jira/browse/MESOS-5527
Project: Mesos
Issue Type: Epic
Components: allocation, framework
Reporter: Benjamin Mahler
As we begin to add support for schedulers to revoke resources to obtain their
quota or fair share, we need to consider the case of non-cooperative or
malicious schedulers that cause excessive revocation either by accident or
intentionally.
For example, a malicious scheduler could keep its allocation below its fair
share, and revoke as many resources as it can in order to disturb existing work
as much as possible.
We can provide mitigation techniques, or incentives / penalties to schedulers
that cause excessive revocation:
* Disallow revocation when resources are available to the scheduler. The
scheduler must choose from the available resources or wait until allocated
resources free up. This means picky schedulers may not obtain the resources
they want.
* Penalize schedulers causing excessive revocation in order to incentivize them
to play nicely.
* Use a degree of pessimism to restrict which resources a scheduler can revoke
(e.g. only batch tasks that have not been running for a long time). If we
augment task information to know whether it is a service or a batch job we may
be able to do better here.
* etc
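To make the three techniques above concrete, here is a minimal sketch in Python. It is not Mesos code: the `Task` fields, the `revocable_tasks` function, and both thresholds are hypothetical names and values chosen for illustration, and only CPUs are modelled. The batch/service flag in particular is an assumption, since the issue only suggests adding such information to tasks.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Task:
    task_id: str
    owner: str            # framework that runs the task
    cpus: float
    is_batch: bool        # hypothetical flag; not existing task information
    runtime_secs: float


def revocable_tasks(
    requester: str,
    cpus_needed: float,
    free_cpus: float,
    running: List[Task],
    recent_revocations: Dict[str, int],
    max_recent_revocations: int = 3,         # assumed penalty threshold
    max_batch_runtime_secs: float = 600.0,   # assumed "not running long" cutoff
) -> List[Task]:
    """Return the tasks the requester may revoke, or [] if revocation is denied."""
    # Technique 1: no revocation while free resources can satisfy the request.
    if free_cpus >= cpus_needed:
        return []

    # Technique 2: penalize schedulers that recently caused many revocations.
    if recent_revocations.get(requester, 0) >= max_recent_revocations:
        return []

    # Technique 3: pessimism; only young batch tasks of other frameworks.
    candidates = [
        t for t in running
        if t.owner != requester
        and t.is_batch
        and t.runtime_secs <= max_batch_runtime_secs
    ]
    # Prefer the youngest tasks so the least work is lost.
    candidates.sort(key=lambda t: t.runtime_secs)

    shortfall = cpus_needed - free_cpus
    chosen: List[Task] = []
    reclaimed = 0.0
    for task in candidates:
        if reclaimed >= shortfall:
            break
        chosen.append(task)
        reclaimed += task.cpus

    # Design choice in this sketch: revoke nothing if the shortfall
    # cannot be covered, so no work is disturbed in vain.
    return chosen if reclaimed >= shortfall else []
```

The all-or-nothing return at the end is one possible choice; a real policy might instead allow partial revocation.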
The techniques employed for work conservation in the presence of revocation
should be configurable, and users should be able to achieve their own custom
work conservation policies by implementing an allocator (or a subcomponent of
the existing allocator).
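One way to picture the configurability described above is a small policy interface that an allocator consults before permitting a revocation. The Python sketch below is purely illustrative and does not reflect the Mesos allocator API: `RevocationPolicy`, `POLICY_REGISTRY`, `build_policies`, and the policy names are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class RevocationPolicy(ABC):
    """Hypothetical subcomponent consulted before a revocation is permitted."""

    @abstractmethod
    def allows(self, framework: str, free_cpus: float, cpus_needed: float) -> bool:
        ...


class RequireNoFreeResources(RevocationPolicy):
    """Deny revocation while free resources could satisfy the request."""

    def allows(self, framework: str, free_cpus: float, cpus_needed: float) -> bool:
        return free_cpus < cpus_needed


class RevocationBudget(RevocationPolicy):
    """Deny revocation once a framework has used up its budget."""

    def __init__(self, limit: int = 3) -> None:  # limit is an assumed default
        self.limit = limit
        self.used: Dict[str, int] = {}

    def record(self, framework: str) -> None:
        self.used[framework] = self.used.get(framework, 0) + 1

    def allows(self, framework: str, free_cpus: float, cpus_needed: float) -> bool:
        return self.used.get(framework, 0) < self.limit


# Operators pick policies by name; custom policies are added to the registry.
POLICY_REGISTRY: Dict[str, Callable[[], RevocationPolicy]] = {
    "require-no-free-resources": RequireNoFreeResources,
    "revocation-budget": RevocationBudget,
}


def build_policies(names: List[str]) -> List[RevocationPolicy]:
    """Instantiate the configured policies, in order."""
    return [POLICY_REGISTRY[name]() for name in names]


def revocation_allowed(
    policies: List[RevocationPolicy],
    framework: str,
    free_cpus: float,
    cpus_needed: float,
) -> bool:
    """A revocation is allowed only if every configured policy agrees."""
    return all(p.allows(framework, free_cpus, cpus_needed) for p in policies)
```

Under these assumptions, a custom work conservation policy is a new `RevocationPolicy` subclass registered under its own name and selected through configuration.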