[
https://issues.apache.org/jira/browse/MESOS-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588632#comment-13588632
]
Benjamin Mahler commented on MESOS-354:
---------------------------------------
If we introduce additional offers for a slave, considered revocable via a
boolean flag, existing schedulers will not be checking this flag and will
schedule on the offer on the assumption that the offer semantics haven't
changed. Perhaps not a huge issue given there are not a lot of production
schedulers out there, but certainly something to keep in mind.
I think I like the explicitness of revocable offers being separate, since they
are to be considered more volatile.
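
The compatibility concern above can be sketched in a few lines. This is only an illustration under assumed names: the Offer class, its revocable flag, and both accept functions are hypothetical and are not the Mesos API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Offer:
    """Hypothetical offer; 'revocable' is the proposed boolean flag."""
    slave_id: str
    cpus: float
    revocable: bool = False


def legacy_accept(offers: List[Offer]) -> List[Offer]:
    # A scheduler written before the flag existed: it only looks at
    # resources, so revocable offers are accepted like any other.
    return [o for o in offers if o.cpus >= 1.0]


def aware_accept(offers: List[Offer], tolerate_revocation: bool) -> List[Offer]:
    # A scheduler that knows about the flag can opt out of volatile offers.
    return [
        o for o in offers
        if o.cpus >= 1.0 and (tolerate_revocation or not o.revocable)
    ]


offers = [Offer("s1", 4.0), Offer("s1", 2.0, revocable=True)]
print(len(legacy_accept(offers)))        # legacy scheduler takes both offers
print(len(aware_accept(offers, False)))  # flag-aware scheduler takes one
```

The legacy scheduler ends up placing work on resources that may be revoked without ever having agreed to that, which is the argument for keeping revocable offers explicit and separate.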
> oversubscribe resources
> -----------------------
>
> Key: MESOS-354
> URL: https://issues.apache.org/jira/browse/MESOS-354
> Project: Mesos
> Issue Type: New Feature
> Components: isolation, master, slave
> Reporter: brian wickman
> Priority: Minor
> Attachments: mesos_virtual_offers.pdf
>
>
> This proposal is predicated upon offer revocation.
> The idea would be to add a new "revoked" status either by (1) piggybacking
> off an existing status update (TASK_LOST or TASK_KILLED) or (2) introducing a
> new status update TASK_REVOKED.
> In order to augment an offer with metadata about revocability, there are
> two options:
> 1) Add a revocable boolean to the Offer and
> a) offer only one type of Offer per slave at a particular time
> b) offer both revocable and non-revocable resources at the same time but
> require frameworks to understand that Offers can contain overlapping resources
> 2) Add a revocable_resources field on the Offer which is a superset of the
> regular resources field. By consuming more than resources but no more than
> revocable_resources in a launchTask, the Task becomes a revocable task. If
> launching a task with less than resources, the Task is non-revocable.
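
Option 2 could be read as the following classification rule. This is a minimal sketch under assumptions: the Resources and Offer types and classify_task are hypothetical names, and a request exactly equal to the regular resources is treated as non-revocable, which the proposal does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: float
    disk_mb: float

    def fits_within(self, other: "Resources") -> bool:
        # True when every dimension is at or below the other's.
        return (
            self.cpus <= other.cpus
            and self.mem_mb <= other.mem_mb
            and self.disk_mb <= other.disk_mb
        )


@dataclass(frozen=True)
class Offer:
    resources: Resources            # regular (non-revocable) resources
    revocable_resources: Resources  # superset of 'resources'


def classify_task(offer: Offer, requested: Resources) -> str:
    # Assumption: fitting inside the regular resources (including the
    # boundary case of equality) keeps the task non-revocable.
    if requested.fits_within(offer.resources):
        return "non-revocable"
    if requested.fits_within(offer.revocable_resources):
        return "revocable"
    raise ValueError("request exceeds the offer")


offer = Offer(
    resources=Resources(4, 8192, 20480),
    revocable_resources=Resources(6, 12288, 20480),
)
print(classify_task(offer, Resources(2, 4096, 1024)))  # non-revocable
print(classify_task(offer, Resources(5, 8192, 1024)))  # revocable
```
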
> The use cases for revocable tasks are batch tasks (e.g. hadoop/pig/mapreduce),
> while non-revocable tasks suit online higher-SLA tasks (e.g. services).
> Consider a non-revocable task that asks for 4 cores, 8 GB RAM and 20 GB of disk.
> One of these resources is a rate (4 cpu seconds per second) and two of them
> are fixed values (8GB and 20GB respectively, though disk resources can be
> further broken down into spindles - fixed - and iops - a rate.) In practice,
> these are the maximum resources in the respective dimensions that this task
> will use. In reality, we provision tasks at some factor below peak, and only
> hit peak resource consumption in rare circumstances or perhaps at a diurnal
> peak.
> In the meantime, we stand to gain from offering some constant factor of
> the difference (reserved - actual) of non-revocable tasks as
> revocable resources, depending upon our tolerance for revocable task churn.
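
As a rough worked example of the "constant factor of (reserved - actual)" idea, here is a sketch under assumptions: revocable_slack, the resource names, and the sample numbers are all hypothetical, and negative differences are clamped to zero.

```python
from typing import Dict


def revocable_slack(
    reserved: Dict[str, float],
    actual: Dict[str, float],
    factor: float,
) -> Dict[str, float]:
    """Return factor * (reserved - actual) per resource, never below zero."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return {
        name: max(0.0, factor * (reserved[name] - actual.get(name, 0.0)))
        for name in reserved
    }


# A task reserved 4 cores and 8 GB but currently uses 1 core and 6 GB.
slack = revocable_slack(
    reserved={"cpus": 4.0, "mem_gb": 8.0},
    actual={"cpus": 1.0, "mem_gb": 6.0},
    factor=0.5,
)
print(slack)  # {'cpus': 1.5, 'mem_gb': 1.0}
```

A lower factor offers less revocable capacity but leaves more headroom before a usage spike forces revocation, which is the churn trade-off mentioned above.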
> The main challenge is coming up with an accurate short / medium / long-term
> prediction of resource consumption based upon current behavior.
> In many cases it would be OK to be sloppy:
> * CPU / iops / network IO are rates (compressible) and can often be OK
> below guarantees for brief periods of time while task revocation takes place
> * Memory slack can be provided by enabling swap and dynamically setting
> swap paging boundaries. Should swap ever be activated, that would be a
> signal to revoke.
> The master / allocator would piggyback on the slave heartbeat mechanism to
> learn of the amount of revocable resources available at any point in time.