[jira] [Commented] (MESOS-354) oversubscribe resources

Niklas Quarfot Nielsen (JIRA) Wed, 28 Jan 2015 12:51:48 -0800

    [ 
https://issues.apache.org/jira/browse/MESOS-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295839#comment-14295839
 ]


Niklas Quarfot Nielsen commented on MESOS-354:
----------------------------------------------

Oversubscription means many things and can be considered a subset of the 
ongoing effort on optimistic offers.
Optimistic offers let the allocator offer:

 - Resources to multiple frameworks, to increase 'parallelism' (as opposed to 
the conservative/pessimistic scheme) and **increase task throughput**.
 - Preemptable resources from unallocated but reserved resources, to **limit 
reservation slack** (difference between reserved and allocated resources).

A third (and equally important) case, which expands on these scenarios, is 
oversubscription of _allocated_ resources, which limits the **usage slack** 
(difference between allocated and used resources).
A lot of recent research shows that usage slack can be reduced by 60% while 
maintaining the Service Level Objective (SLO) of latency-critical 
workloads (1). However, this kind of oversubscription needs policies and 
fine-tuning to make sure that best-effort tasks don't interfere with 
latency-critical ones. Therefore, we'd like to start a discussion on how such a 
system would look in Mesos. I will create a JIRA ticket (linking to this one) 
to start the conversation.

(1) 
http://static.googleusercontent.com/media/research.google.com/en/us/pubs/archive/43017.pdf
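
As a minimal sketch of the two slack definitions used in this comment 
(reservation slack and usage slack): the class and function names below are 
hypothetical and are not Mesos APIs, and the numbers are made up for 
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    """Hypothetical per-node totals for one resource (e.g. CPU cores)."""
    reserved: float   # set aside for a role or framework
    allocated: float  # handed out to running tasks
    used: float       # actually consumed by those tasks


def reservation_slack(r: NodeResources) -> float:
    # Reserved but not allocated: could back preemptable offers.
    return r.reserved - r.allocated


def usage_slack(r: NodeResources) -> float:
    # Allocated but not used: the target of oversubscription.
    return r.allocated - r.used


# Illustrative values only.
node = NodeResources(reserved=16.0, allocated=12.0, used=5.0)
print(reservation_slack(node))
print(usage_slack(node))
```

With these example values the reservation slack is 4 cores and the usage 
slack is 7 cores.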

> oversubscribe resources
> -----------------------
>
>                 Key: MESOS-354
>                 URL: https://issues.apache.org/jira/browse/MESOS-354
>             Project: Mesos
>          Issue Type: Story
>          Components: isolation, master, slave
>            Reporter: brian wickman
>            Priority: Minor
>         Attachments: mesos_virtual_offers.pdf
>
>
> This proposal is predicated upon offer revocation.
> The idea would be to add a new "revoked" status either by (1) piggybacking 
> off an existing status update (TASK_LOST or TASK_KILLED) or (2) introducing a 
> new status update TASK_REVOKED.
> In order to augment an offer with metadata about revocability, there are 
> options:
>   1) Add a revocable boolean to the Offer and
>     a) offer only one type of Offer per slave at a particular time
>     b) offer both revocable and non-revocable resources at the same time but 
> require frameworks to understand that Offers can contain overlapping resources
>   2) Add a revocable_resources field on the Offer which is a superset of the 
> regular resources field.  By consuming more than resources but no more than 
> revocable_resources in a launchTask, the Task becomes a revocable task.  A 
> task launched with less than resources is non-revocable.
> Revocable tasks suit batch workloads (e.g. hadoop/pig/mapreduce), while 
> non-revocable tasks are online higher-SLA tasks (e.g. services).
> Consider a non-revocable task that asks for 4 cores, 8 GB RAM and 20 GB of disk.  
> One of these resources is a rate (4 cpu seconds per second) and two of them 
> are fixed values (8GB and 20GB respectively, though disk resources can be 
> further broken down into spindles - fixed - and iops - a rate.)  In practice, 
> these are the maximum resources in the respective dimensions that this task 
> will use.  In reality, we provision tasks at some factor below peak, and only 
> hit peak resource consumption in rare circumstances or perhaps at a diurnal 
> peak.  
> In the meantime, we stand to gain from offering some constant factor of 
> the difference (reserved - actual) of non-revocable tasks as 
> revocable resources, depending upon our tolerance for revocable task churn.  
> The main challenge is coming up with an accurate short / medium / long-term 
> prediction of resource consumption based upon current behavior.
> In many cases it would be OK to be sloppy:
>   * CPU / iops / network IO are rates (compressible) and can often be OK 
> below guarantees for brief periods of time while task revocation takes place
>   * Memory slack can be provided by enabling swap and dynamically setting 
> swap paging boundaries.  Should swap ever be activated, that would be a 
> signal to revoke.
> The master / allocator would piggyback on the slave heartbeat mechanism to 
> learn of the amount of revocable resources available at any point in time.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
