[ https://issues.apache.org/jira/browse/SPARK-10293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-10293:
---------------------------------
Labels: bulk-closed (was: )
> Add support for oversubscription in Mesos
> -----------------------------------------
>
> Key: SPARK-10293
> URL: https://issues.apache.org/jira/browse/SPARK-10293
> Project: Spark
> Issue Type: Story
> Components: Mesos
> Reporter: Chris Bannister
> Priority: Major
> Labels: bulk-closed
>
> Currently, when running Spark on Mesos, each executor claims all the CPU
> resources offered to it. A Spark executor can therefore hold all the CPU
> resources on a single slave while underutilising the CPU allocated to it.
> Mesos added support in 0.23 for oversubscription, where frameworks can be
> offered slack resources for CPU resources already allocated. For example, if
> a task is allocated 10 CPUs but is only using 1, the 9 idle CPUs are offered
> to other frameworks as revocable resources. If the original task starts
> using its allocated CPU, Mesos preempts the revocable task, killing it.
> From a cluster usage perspective it would be very useful to be able to
> specify that some jobs are revocable and can be run on slack resources, and
> that their tasks should be rescheduled without affecting the job status
> (i.e. not counted towards job failure) when they are revoked.
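
The behaviour requested above can be illustrated with a minimal sketch. It is not Spark or Mesos code: the names `slack_cpus`, `EndReason`, and `TaskAttempts`, and the `max_failures` limit, are hypothetical and exist only to show the two ideas in the description, namely how slack is derived from allocated versus used CPU, and how a revoked task could be rescheduled without counting as a failure.

```python
from dataclasses import dataclass
from enum import Enum


class EndReason(Enum):
    """Why a task attempt ended (hypothetical categories for illustration)."""
    FINISHED = "finished"
    FAILED = "failed"
    REVOKED = "revoked"  # killed because the slack resources were reclaimed


def slack_cpus(allocated: float, used: float) -> float:
    """CPU that is allocated but idle, and so could be offered as revocable."""
    return max(allocated - used, 0.0)


@dataclass
class TaskAttempts:
    """Tracks attempts for one task; max_failures is an assumed limit."""
    max_failures: int = 4
    failures: int = 0
    reschedules: int = 0

    def record(self, reason: EndReason) -> str:
        """Return the next action: 'done', 'reschedule', or 'abort'."""
        if reason is EndReason.FINISHED:
            return "done"
        if reason is EndReason.REVOKED:
            # Revocation is not the task's fault: retry, but do not
            # count it towards the failure limit.
            self.reschedules += 1
            return "reschedule"
        self.failures += 1
        if self.failures >= self.max_failures:
            return "abort"
        return "reschedule"
```

With the figures from the description, `slack_cpus(10, 1)` gives 9 CPUs of slack. A task that is revoked any number of times keeps `failures` at 0, so only genuine failures can cause the job to be aborted.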