GitHub user 10110346 opened a pull request:
https://github.com/apache/spark/pull/22774
[SPARK-25780][CORE] Scheduling the tasks which have no higher level locality first
## What changes were proposed in this pull request?
For example:
An application has two executors, (exec1, host1) and (exec2, host2), and three tasks
with these locality preferences: {task0, Seq(TaskLocation("host1", "exec1"))}, {task1,
Seq(TaskLocation("host1", "exec1"), TaskLocation("host2"))}, and {task2,
Seq(TaskLocation("host2"))}.
If task0 is running on exec1 when `allowedLocality` is NODE_LOCAL for exec2,
it is better to schedule task2 first rather than task1, because task1 may later be
scheduled on exec1, where it would be PROCESS_LOCAL.
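The preference described above can be sketched roughly as follows. This is only an illustrative model, not the code in this patch: `Loc`, `Task`, `LocalityPreferenceSketch`, `hasHigherLocalityElsewhere`, and `pickNodeLocal` are hypothetical names, and `Loc` is a simplified stand-in for Spark's `TaskLocation`.

```scala
// Simplified stand-in for TaskLocation (hypothetical, not Spark's class):
// a host, plus an executor id when the preference is executor-specific.
final case class Loc(host: String, executorId: Option[String] = None)

// A pending task with its locality preferences (hypothetical model).
final case class Task(id: Int, prefs: Seq[Loc])

object LocalityPreferenceSketch {

  // True if the task names a live executor other than the offered one,
  // i.e. it could still get a higher locality level somewhere else.
  def hasHigherLocalityElsewhere(
      task: Task,
      offeredExec: String,
      liveExecs: Set[String]): Boolean =
    task.prefs.exists(_.executorId.exists(e => e != offeredExec && liveExecs.contains(e)))

  // Among pending tasks that prefer the offered host, pick first a task
  // that has no better placement elsewhere; fall back to the others.
  def pickNodeLocal(
      pending: Seq[Task],
      offeredExec: String,
      offeredHost: String,
      liveExecs: Set[String]): Option[Task] = {
    val nodeLocal = pending.filter(_.prefs.exists(_.host == offeredHost))
    // partition returns (tasks matching the predicate, tasks not matching it)
    val (canWait, noBetterOption) =
      nodeLocal.partition(hasHigherLocalityElsewhere(_, offeredExec, liveExecs))
    noBetterOption.headOption.orElse(canWait.headOption)
  }
}
```

With task1 and task2 from the example pending and an offer from exec2 on host2, `pickNodeLocal` in this sketch selects task2, leaving task1 available for exec1.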
## How was this patch tested?
Added a unit test.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/10110346/spark higher_locality
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22774.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22774
----
commit 7076bdef5c633739a17e6e9f7ed0c80ed5cb11de
Author: liuxian <liu.xian3@...>
Date: 2018-10-19T06:36:30Z
fix
----