[ 
https://issues.apache.org/jira/browse/SPARK-19373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888692#comment-15888692
 ] 

Michael Gummelt commented on SPARK-19373:
-----------------------------------------

This change lets the user instruct the driver to wait for all executors to 
register before scheduling tasks.  The TaskSchedulerImpl understands 
locality, so it can then make the optimal placement.  Otherwise, tasks are 
scheduled as soon as the first executor registers, and that executor might 
not be node-local for the first task.

However, this still assumes that executors will be scheduled on the correct 
nodes, which isn't guaranteed unless you're launching executors on every node 
in your cluster.  For the best locality, we need to integrate task locality 
information with dynamic allocation, so that the driver can dynamically spin 
up executors on the needed nodes.  That is outside the scope of this JIRA, 
though.
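
To make the difference between the two counters concrete, here is a minimal 
toy model of the readiness check, not the actual Spark code.  The 
`BackendState` class and both function names are hypothetical; only 
`totalCoresAcquired`, `totalCoreCount` and the ratio setting come from the 
issue, and the comparison against the requested cores is an assumption about 
how the ratio is applied.

```python
from dataclasses import dataclass


@dataclass
class BackendState:
    """Toy stand-in for the scheduler backend's counters (hypothetical)."""
    max_cores: int             # cores the application asked for
    total_cores_acquired: int  # cores accepted from offers (totalCoresAcquired)
    total_core_count: int      # cores on registered executors (totalCoreCount)


def ready_by_acquired(state: BackendState, min_ratio: float) -> bool:
    # Behaviour described in the issue: counts cores the scheduler accepted,
    # even if the executors using them have not registered yet.
    return state.total_cores_acquired >= state.max_cores * min_ratio


def ready_by_registered(state: BackendState, min_ratio: float) -> bool:
    # Proposed behaviour: counts only cores of executors that registered.
    return state.total_core_count >= state.max_cores * min_ratio


# All 8 cores accepted, but only one 2-core executor has registered so far.
state = BackendState(max_cores=8, total_cores_acquired=8, total_core_count=2)
print(ready_by_acquired(state, 1.0))    # True: scheduling starts too early
print(ready_by_registered(state, 1.0))  # False: driver keeps waiting
```

With a ratio of 1.0, the first check lets scheduling start while most 
executors are still missing, which is the case where the first tasks can 
lose node-local placement.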

> Mesos implementation of spark.scheduler.minRegisteredResourcesRatio looks at 
> acquired cores rather than registered cores
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-19373
>                 URL: https://issues.apache.org/jira/browse/SPARK-19373
>             Project: Spark
>          Issue Type: Bug
>          Components: Mesos
>    Affects Versions: 2.1.0
>            Reporter: Michael Gummelt
>
> We're currently using `totalCoresAcquired` to account for registered 
> resources, which is incorrect.  That variable measures the number of cores 
> the scheduler has accepted.  We should be using `totalCoreCount` like the 
> other schedulers do.
> Fixing this is important for locality, since users often want to wait for all 
> executors to come up before scheduling tasks to ensure they get a node-local 
> placement. 
> Original PR to add support: https://github.com/apache/spark/pull/8672/files



