Kay Ousterhout created SPARK-4383:
-------------------------------------
Summary: Delay scheduling doesn't work right when jobs have tasks
with different locality levels
Key: SPARK-4383
URL: https://issues.apache.org/jira/browse/SPARK-4383
Project: Spark
Issue Type: Bug
Affects Versions: 1.1.0, 1.0.2
Reporter: Kay Ousterhout
Copied from mailing list discussion:
Our application loads data from HDFS in the same Spark cluster, so the loading
stage gets both NODE_LOCAL and RACK_LOCAL tasks. If all tasks in the loading
stage have the same locality level, either NODE_LOCAL or RACK_LOCAL, it works
fine.
But if the tasks in the loading stage have mixed locality levels, such as 3
NODE_LOCAL tasks and 2 RACK_LOCAL tasks, then the TaskSetManager of the loading
stage submits the 3 NODE_LOCAL tasks as soon as resources are offered and then
waits for spark.locality.wait.node, which was set to 30 minutes. The 2
RACK_LOCAL tasks wait 30 minutes even though resources are available.
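
The behavior described above can be reproduced with a small toy model. This is
only a sketch under stated assumptions, not Spark's TaskSetManager code: the
names ToyTaskSet and simulate are hypothetical, each task is assumed to have one
fixed best locality level, and the whole task set shares a single "allowed"
locality level that relaxes only after the configured wait has passed since the
last launch.

```python
from dataclasses import dataclass, field

# Locality levels from most to least local (only the two from the report).
LEVELS = ["NODE_LOCAL", "RACK_LOCAL"]


@dataclass
class ToyTaskSet:
    """Hypothetical stand-in for a task set; not Spark's TaskSetManager."""
    pending: dict          # task name -> best locality level it can get
    waits: dict            # locality level -> seconds to wait before relaxing
    level_index: int = 0   # one shared "allowed" level for the whole set
    last_launch: float = 0.0
    launched: dict = field(default_factory=dict)  # task name -> launch time

    def allowed_level(self, now):
        # Relax the shared level only once its wait has expired
        # since the last launch.
        while self.level_index < len(LEVELS) - 1:
            wait = self.waits[LEVELS[self.level_index]]
            if now - self.last_launch < wait:
                break
            self.last_launch += wait
            self.level_index += 1
        return self.level_index

    def offer(self, now):
        # One free slot is offered at time `now`.
        allowed = self.allowed_level(now)
        for name, level in list(self.pending.items()):
            if LEVELS.index(level) <= allowed:
                del self.pending[name]
                self.launched[name] = now
                # Launching resets the shared level to this task's level.
                self.level_index = LEVELS.index(level)
                self.last_launch = now
                return name
        return None


def simulate(wait_node=1800, slots=5, step=60, horizon=3600):
    # 3 NODE_LOCAL and 2 RACK_LOCAL tasks, as in the report.
    tasks = {"n1": "NODE_LOCAL", "n2": "NODE_LOCAL", "n3": "NODE_LOCAL",
             "r1": "RACK_LOCAL", "r2": "RACK_LOCAL"}
    ts = ToyTaskSet(pending=tasks,
                    waits={"NODE_LOCAL": wait_node, "RACK_LOCAL": 0})
    # Offer `slots` free slots every `step` seconds.
    for now in range(0, horizon + 1, step):
        for _ in range(slots):
            ts.offer(now)
    return ts.launched


if __name__ == "__main__":
    print(simulate())
```

In this model the three NODE_LOCAL tasks launch at t=0, while the two
RACK_LOCAL tasks launch only at t=1800 seconds, although free slots were
offered at every step in between.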
Fixing this is quite tricky -- do we need to track the locality level
individually for each task?