Kay Ousterhout created SPARK-2294:
-------------------------------------

             Summary: TaskSchedulerImpl and TaskSetManager do not properly 
prioritize which tasks get assigned to an executor
                 Key: SPARK-2294
                 URL: https://issues.apache.org/jira/browse/SPARK-2294
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.0.0, 1.0.1
            Reporter: Kay Ousterhout


If an executor E is free, a speculative copy of a task may be assigned to E even 
when other tasks in the job have not been launched at all yet.  Similarly, a 
task without any locality preferences may be assigned to E when there is another 
NODE_LOCAL task that could have been scheduled.

This happens because TaskSchedulerImpl calls TaskSetManager.resourceOffer 
(which in turn calls TaskSetManager.findTask) with increasing locality levels, 
beginning with PROCESS_LOCAL, followed by NODE_LOCAL, and so on until the 
highest currently allowed level.  Now, suppose NODE_LOCAL is the highest 
currently allowed locality level.  The first time findTask is called, it will 
be called with max level PROCESS_LOCAL; if it cannot find any PROCESS_LOCAL 
tasks, it will try to schedule tasks with no locality preferences or 
speculative tasks.  As a result, speculative tasks or tasks with no preferences 
may be scheduled instead of NODE_LOCAL tasks.
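To illustrate the interaction, here is a simplified Python sketch of the two loops described above (hypothetical names and data structures; not Spark's actual Scala implementation):

```python
# Sketch (NOT Spark's real code) of TaskSchedulerImpl offering an executor
# at increasing locality levels, and of findTask falling through to
# no-preference tasks even in the PROCESS_LOCAL round -- the reported bug.

PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, ANY = range(4)

def find_task(pending, executor, host, max_locality):
    """Mimics TaskSetManager.findTask for one (executor, host) offer."""
    # 1. Tasks whose preferred executor matches (PROCESS_LOCAL).
    for t in pending:
        if t.get("pref_executor") == executor:
            return t
    # 2. Even when called with max_locality == PROCESS_LOCAL, tasks with
    #    no locality preferences (and speculative copies) are considered
    #    here, before NODE_LOCAL tasks ever get a chance.
    for t in pending:
        if not t.get("pref_executor") and not t.get("pref_host"):
            return t
    # 3. NODE_LOCAL tasks are only reachable at a higher max level.
    if max_locality >= NODE_LOCAL:
        for t in pending:
            if t.get("pref_host") == host:
                return t
    return None

def resource_offer(pending, executor, host, allowed_max):
    """Mimics TaskSchedulerImpl: retry the offer at increasing levels."""
    for level in range(PROCESS_LOCAL, allowed_max + 1):
        task = find_task(pending, executor, host, level)
        if task is not None:
            return task, level
    return None, None

pending = [
    {"name": "node_local_task", "pref_host": "hostA"},
    {"name": "no_pref_task"},
]
task, level = resource_offer(pending, executor="exec1", host="hostA",
                             allowed_max=NODE_LOCAL)
# The no-preference task wins during the PROCESS_LOCAL round, even though
# a NODE_LOCAL task for hostA was sitting in the queue.
print(task["name"])  # -> no_pref_task
```

The fallthrough in step 2 of find_task is what lets speculative and no-preference tasks jump ahead of NODE_LOCAL work.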

cc [~matei]



--
This message was sent by Atlassian JIRA
(v6.2#6252)