SuYan created SPARK-4167:
----------------------------
Summary: Schedule task on Executor will be Imbalance while task
run less than local-wait time
Key: SPARK-4167
URL: https://issues.apache.org/jira/browse/SPARK-4167
Project: Spark
Issue Type: Improvement
Components: Spark Core
Affects Versions: 1.1.0
Reporter: SuYan
Recently, while running a Spark-on-YARN job, we saw tasks being scheduled
unevenly across executors. The scenario was as follows:
1. Because of a user mistake, the job's input contained empty (0-byte) splits.
1.1: tasks 0-99 are no-preference tasks (0-byte input); tasks 100-800 are
node-local tasks
1.2: the user runs these tasks for 500 loops
1.3: 60 executors
2. In the first loop, executor A has only 2 node-local tasks. It finishes
them first and then starts running no-preference tasks, which in our case
have smaller input splits than the node-local tasks. So executor A finishes
all of the no-preference tasks while the other executors are still running
node-local tasks.
In the second loop, all tasks are at the process-local level and every task
finishes within 3 seconds, which is less than the locality wait time.
Executor A therefore still has process-local tasks to run after the other
executors have finished theirs. Because each process-local task on executor A
finishes in under 3 seconds, the locality wait never expires and the locality
level always stays at process-local.
As a result, the other executors all sit idle waiting for executor A, and the
same thing happens in each of the remaining loops.
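
The effect described above can be reproduced with a small toy model. This is
only a sketch under simplifying assumptions, not Spark's scheduler code: the
function name simulate, the executor names, and the single "relaxed" flag are
all made up for illustration. It assumes every pending task is process-local
to executor A only, each executor runs one task at a time, and the scheduler
lets other executors take a task only after no process-local task has been
launched for wait_ms milliseconds.

```python
def simulate(num_tasks, task_ms, wait_ms, other_executors, step_ms=100):
    """Toy delay-scheduling model (hypothetical, not Spark's implementation).

    Every pending task is process-local to executor "A" only.
    Returns (makespan_ms, tasks_run_per_executor).
    """
    busy_until = {"A": 0}
    for i in range(other_executors):
        busy_until["other-%d" % i] = 0
    ran = {name: 0 for name in busy_until}

    pending = num_tasks
    last_local_launch = 0   # time of the last process-local launch
    relaxed = False         # True once non-local executors may take tasks
    now = 0
    while pending > 0:
        # Fall back to a less local level only after wait_ms without
        # any process-local launch.
        if now - last_local_launch >= wait_ms:
            relaxed = True
        for name in busy_until:
            if pending == 0:
                break
            if busy_until[name] > now:
                continue            # executor is still running a task
            if name == "A":
                # A process-local launch resets the wait timer and the level.
                last_local_launch = now
                relaxed = False
            elif not relaxed:
                continue            # idle, but not allowed to take a task yet
            busy_until[name] = now + task_ms
            ran[name] += 1
            pending -= 1
        now += step_ms
    return max(busy_until.values()), ran


if __name__ == "__main__":
    # 1-second tasks with a 3-second wait: the timer is reset on every launch.
    print(simulate(10, 1000, 3000, 3))
    # 5-second tasks with a 3-second wait: the wait expires and others help.
    print(simulate(10, 5000, 3000, 3))
```

In this model, with 1-second tasks and a 3-second wait, executor A runs all
ten tasks one after another while the three other executors stay idle. With
5-second tasks the wait expires while A is busy, so the other executors pick
up part of the work.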
To work around this, we asked the user to delete the empty input splits.
But an implicit imbalance still remains: if, in some loop, one executor gets
more process-local tasks than the others, and those tasks all take less than
3 seconds, then in the remaining loops the other executors will wait for that
executor to finish all of its process-local tasks.