[ 
https://issues.apache.org/jira/browse/SPARK-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-10293:
---------------------------------
    Labels: bulk-closed  (was: )

> Add support for oversubscription in Mesos
> -----------------------------------------
>
>                 Key: SPARK-10293
>                 URL: https://issues.apache.org/jira/browse/SPARK-10293
>             Project: Spark
>          Issue Type: Story
>          Components: Mesos
>            Reporter: Chris Bannister
>            Priority: Major
>              Labels: bulk-closed
>
> Currently, when running Spark on Mesos, each executor takes all the CPU
> resources offered to it. This can lead to cases where a Spark executor
> holds all the CPU resources on a single slave but underutilises the CPU
> allocated to it.
> Mesos added support in 0.23 for oversubscription, where frameworks can be
> offered slack resources for CPU resources already allocated. For example,
> if a task is allocated 10 cpus but is only using 1, the 9 unused cpus can
> be offered to other frameworks as revocable resources. If the original task
> starts using its allocated CPU, Mesos will preempt the revocable task,
> killing it.
> From a cluster usage perspective, it would be very useful to be able to
> specify that some jobs are revocable and can be run in slack resources, and
> that they should be rescheduled without affecting the job status (i.e. not
> count towards job failure) when a task is revoked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
