GitHub user CodingCat opened a pull request:
https://github.com/apache/spark/pull/1313
SPARK-2294: fix locality inversion bug in TaskManager
copied from original JIRA
(https://issues.apache.org/jira/browse/SPARK-2294):
If an executor E is free, a task may be speculatively assigned to E when
there are other tasks in the job that have not been launched (at all) yet.
Similarly, a task without any locality preferences may be assigned to E when
there was another NODE_LOCAL task that could have been scheduled.
This happens because TaskSchedulerImpl calls TaskSetManager.resourceOffer
(which in turn calls TaskSetManager.findTask) with increasing locality levels,
beginning with PROCESS_LOCAL, followed by NODE_LOCAL, and so on until the
highest currently allowed level. Now, suppose NODE_LOCAL is the highest
currently allowed locality level. The first time findTask is called, it will be
called with max level PROCESS_LOCAL; if it cannot find any PROCESS_LOCAL tasks,
it will try to schedule tasks with no locality preferences or speculative
tasks. As a result, speculative tasks or tasks with no preferences may be
scheduled instead of NODE_LOCAL tasks.
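
To make the inversion concrete, here is a minimal Python sketch of the behaviour described above. It is a simplified model, not Spark's actual code: the function names find_task_buggy and resource_offer_buggy, the dict of pending tasks, and the "NO_PREF" / "SPECULATIVE" bucket names are all assumptions made for illustration.

```python
# Simplified model of the pre-fix behaviour; NOT Spark's actual code.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]

def find_task_buggy(pending, locality):
    """Return a task launchable at or below `locality`, or None."""
    allowed = LEVELS[: LEVELS.index(locality) + 1]
    for level in allowed:
        if pending.get(level):
            return pending[level].pop(0)
    # The problem: this fallback runs on every call, even when `locality`
    # is still below the highest level the scheduler is going to try.
    for bucket in ("NO_PREF", "SPECULATIVE"):
        if pending.get(bucket):
            return pending[bucket].pop(0)
    return None

def resource_offer_buggy(pending, highest_allowed):
    """Try each locality level in increasing order up to `highest_allowed`."""
    for level in LEVELS[: LEVELS.index(highest_allowed) + 1]:
        task = find_task_buggy(pending, level)
        if task is not None:
            return task
    return None

# One NODE_LOCAL task and one task without locality preferences are pending.
pending = {"NODE_LOCAL": ["node_local_task"], "NO_PREF": ["no_pref_task"]}
chosen = resource_offer_buggy(pending, "NODE_LOCAL")
print(chosen)
```

In this model the PROCESS_LOCAL pass finds nothing and falls through to the no-preference bucket, so the NODE_LOCAL task is left waiting even though NODE_LOCAL was an allowed level.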
----
I added an additional parameter, maxLocality, to resourceOffer and findTask,
indicating when we should consider the tasks without locality preference.
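
As a rough illustration of how such a parameter could be used, the Python sketch below gates the fallback on the current level having reached max_locality. This is only one possible reading of the description, under the assumption that tasks without preferences (and speculative tasks) are considered only on the last pass; the function names, the pending dict, and the bucket names are hypothetical and do not come from the patch itself.

```python
# Simplified model of one possible fix; NOT the code in the pull request.
LEVELS = ["PROCESS_LOCAL", "NODE_LOCAL", "RACK_LOCAL", "ANY"]

def find_task(pending, locality, max_locality):
    """Return a task launchable at or below `locality`, or None."""
    allowed = LEVELS[: LEVELS.index(locality) + 1]
    for level in allowed:
        if pending.get(level):
            return pending[level].pop(0)
    # Assumption: fall back to no-preference / speculative tasks only once
    # the level being tried has reached the highest allowed level.
    if locality == max_locality:
        for bucket in ("NO_PREF", "SPECULATIVE"):
            if pending.get(bucket):
                return pending[bucket].pop(0)
    return None

def resource_offer(pending, max_locality):
    """Try each locality level in increasing order up to `max_locality`."""
    for level in LEVELS[: LEVELS.index(max_locality) + 1]:
        task = find_task(pending, level, max_locality)
        if task is not None:
            return task
    return None

pending = {"NODE_LOCAL": ["node_local_task"], "NO_PREF": ["no_pref_task"]}
first = resource_offer(pending, "NODE_LOCAL")
second = resource_offer(pending, "NODE_LOCAL")
third = resource_offer(pending, "NODE_LOCAL")
print(first, second, third)
```

With the gate in place, the model hands out the NODE_LOCAL task first and only then the task without preferences.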
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/CodingCat/spark SPARK-2294
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/1313.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1313
----
commit 35524413287990734685125ec02eb8dd58f97b12
Author: CodingCat <[email protected]>
Date: 2014-07-07T04:37:06Z
fix locality inversion bug in TaskManager
----