Benjamin Hindman created MESOS-1607:
---------------------------------------
Summary: Introduce optimistic offers.
Key: MESOS-1607
URL: https://issues.apache.org/jira/browse/MESOS-1607
Project: Mesos
Issue Type: Epic
Components: allocation, framework, master
Reporter: Benjamin Hindman
The current implementation of resource offers enables only a single framework
scheduler to make scheduling decisions for some available resources at a time.
In some circumstances, this is good, e.g., when we don't want other framework
schedulers to have access to some resources. However, in other circumstances,
there are advantages to letting multiple framework schedulers attempt to make
scheduling decisions for the _same_ allocation of resources in parallel.
If you think about this from a "concurrency control" perspective, the current
implementation of resource offers is _pessimistic_: the resources contained
within an offer are _locked_ until the framework scheduler that they were
offered to launches tasks with them or declines them. In addition to making
pessimistic offers, we'd like to give out _optimistic_ offers, where the same
resources are offered to multiple framework schedulers at the same time, and
framework schedulers "compete" for those resources on a first-come, first-served
basis (i.e., the first to launch a task "wins"). We've always reserved the
right to rescind resource offers using the 'rescind' primitive in the API, and
a framework scheduler should be prepared to launch tasks and have those tasks
go lost because another framework already started to use those resources.
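The difference between the two offer styles can be illustrated with a toy model. This is only a sketch and not the Mesos API: the `ToyMaster` class and its `offer` and `launch` methods are hypothetical names, the model tracks a single block of CPUs, and it assumes that the first successful launch rescinds every other outstanding offer.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMaster:
    """Toy model of a master handing out one block of CPUs (hypothetical, not Mesos)."""
    free_cpus: float
    # Frameworks currently holding an offer for the same block of CPUs.
    offered_to: set = field(default_factory=set)

    def offer(self, framework: str, optimistic: bool) -> bool:
        """Offer the free CPUs to `framework`; return whether an offer was made."""
        if self.free_cpus <= 0:
            return False
        if not optimistic and self.offered_to:
            # Pessimistic: the resources are locked by the outstanding offer.
            return False
        self.offered_to.add(framework)
        return True

    def launch(self, framework: str, cpus: float) -> str:
        """Try to launch a task against an outstanding offer."""
        if framework not in self.offered_to or cpus > self.free_cpus:
            # Another framework already used the resources: the task goes lost.
            self.offered_to.discard(framework)
            return "TASK_LOST"
        self.free_cpus -= cpus
        # Assumption: the first launch wins and all other offers are rescinded.
        self.offered_to.clear()
        return "TASK_RUNNING"


# Optimistic: both frameworks hold an offer for the same 4 CPUs.
optimistic = ToyMaster(free_cpus=4.0)
offers = [optimistic.offer("A", optimistic=True), optimistic.offer("B", optimistic=True)]
winner = optimistic.launch("B", 4.0)
loser = optimistic.launch("A", 4.0)

# Pessimistic: the second framework gets no offer while the first one is outstanding.
pessimistic = ToyMaster(free_cpus=4.0)
locked = [pessimistic.offer("A", optimistic=False), pessimistic.offer("B", optimistic=False)]
```

In the optimistic run both frameworks receive an offer, the first launch succeeds, and the later one is reported lost. In the pessimistic run the second framework is simply not offered the resources.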
Introducing optimistic offers will enable more sophisticated allocation
algorithms. For example, we can optimistically allocate resources that are
reserved for a particular framework (role) but are not being used. In
conjunction with revocable resources (the concept that using resources not
reserved for you means you might get those resources revoked) we can easily
create a "spot" market for unused resources, driving up utilization by letting
frameworks that are willing to use revocable resources run tasks.
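As a rough illustration of the "spot" market idea, the following sketch computes how much reserved-but-idle capacity could be lent out as revocable, and how much would have to be revoked when the owning role wants it back. The function names and the per-role CPU dictionaries are assumptions made for this example, not part of any existing allocator.

```python
def revocable_pool(reserved, used):
    """Per role, the reserved-but-idle CPUs that could be offered as revocable."""
    return {
        role: max(cpus - used.get(role, 0.0), 0.0)
        for role, cpus in reserved.items()
    }


def cpus_to_revoke(owner_demand, idle, lent):
    """CPUs to take back from borrowers when the owner needs more than is idle."""
    shortfall = owner_demand - idle
    # Never revoke more than was actually lent out.
    return min(max(shortfall, 0.0), lent)


# Hypothetical numbers: "prod" reserves 8 CPUs but uses only 3 of them.
pool = revocable_pool({"prod": 8.0}, {"prod": 3.0})
# All 5 idle CPUs were lent out; "prod" now needs 2 more CPUs and none are idle.
revoked = cpus_to_revoke(owner_demand=2.0, idle=0.0, lent=pool["prod"])
```

Here 5 CPUs become available to frameworks willing to run on revocable resources, and 2 of them are reclaimed once the owning role's demand grows.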
In the limit, one could imagine always making optimistic resource offers. This
bears a striking resemblance to the Google Omega model (an isomorphism, even).
However, being able to configure what resources should be allocated
optimistically and what resources should be allocated pessimistically gives
even more control to a datacenter/cluster operator that might want to, for
example, never let multiple frameworks (roles) compete for some set of
resources.
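One way to picture the operator-side configuration is a per-role policy that decides how many frameworks may see the same resources at once. This is a sketch under assumptions: `OfferMode`, `recipients`, and the role-keyed `policy` mapping are hypothetical and do not correspond to an existing Mesos flag or API.

```python
from enum import Enum


class OfferMode(Enum):
    PESSIMISTIC = "pessimistic"
    OPTIMISTIC = "optimistic"


def recipients(resource_role, candidates, policy, default=OfferMode.PESSIMISTIC):
    """Pick which frameworks get an offer for resources tagged `resource_role`."""
    mode = policy.get(resource_role, default)
    if mode is OfferMode.OPTIMISTIC:
        # Every candidate sees the same resources and competes for them.
        return list(candidates)
    # Pessimistic: only one framework holds the offer at a time.
    return list(candidates[:1])


# Hypothetical operator policy: "batch" resources may be contended, "prod" may not.
policy = {"batch": OfferMode.OPTIMISTIC, "prod": OfferMode.PESSIMISTIC}
frameworks = ["A", "B", "C"]
batch_offer = recipients("batch", frameworks, policy)
prod_offer = recipients("prod", frameworks, policy)
```

Defaulting unknown roles to pessimistic keeps the current behavior unless an operator explicitly opts a set of resources into contention.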