[
https://issues.apache.org/jira/browse/SPARK-11460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kay Ousterhout resolved SPARK-11460.
------------------------------------
Resolution: Duplicate
> Locality waits should be based on task set creation time, not last launch time
> ------------------------------------------------------------------------------
>
> Key: SPARK-11460
> URL: https://issues.apache.org/jira/browse/SPARK-11460
> Project: Spark
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0, 1.1.1, 1.2.0, 1.2.1, 1.2.2,
> 1.3.0, 1.3.1, 1.4.0, 1.4.1, 1.5.0, 1.5.1
> Environment: YARN
> Reporter: Shengyue Ji
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> Spark waits for the spark.locality.wait period before going from RACK_LOCAL to
> ANY when selecting an executor for assignment. The timeout is essentially
> reset each time a new assignment is made.
> We were running Spark streaming on Kafka with a 10 second batch window on 32
> Kafka partitions with 16 executors. All executors were in the ANY group. At
> one point one RACK_LOCAL executor was added and all tasks were assigned to
> it. Each task took about 0.6 seconds to process, resetting the
> spark.locality.wait timeout (3000ms) repeatedly. This caused the whole
> process to underutilize resources and created an increasing backlog.
> spark.locality.wait should be based on the task set creation time, not the
> last launch time, so that 3000ms after initial creation, all executors can
> get tasks assigned to them.
> We are specifying a zero timeout for now as a workaround to disable locality
> optimization.
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L556
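The difference between the two timer anchors can be sketched as follows. This is a minimal illustration, not the actual TaskSetManager code; the object and method names are hypothetical, and the 3000ms value is the spark.locality.wait default described above.

```scala
// Hypothetical sketch contrasting the reported behavior (timer reset on every
// task launch) with the proposed behavior (timer anchored at task set
// creation). Names here are illustrative, not Spark's actual internals.
object LocalityWaitSketch {
  val localityWaitMs = 3000L // spark.locality.wait default

  // Reported behavior: the wait timer resets on each launch, so tasks that
  // finish every ~600ms keep the scheduler stuck at RACK_LOCAL indefinitely.
  def allowRelaxByLastLaunch(now: Long, lastLaunchTime: Long): Boolean =
    now - lastLaunchTime >= localityWaitMs

  // Proposed behavior: the timer is anchored at task set creation, so after
  // 3000ms all executors become eligible regardless of launch cadence.
  def allowRelaxByCreation(now: Long, taskSetCreationTime: Long): Boolean =
    now - taskSetCreationTime >= localityWaitMs

  def main(args: Array[String]): Unit = {
    val creation = 0L
    val lastLaunch = 4200L // a task was launched 600ms ago
    val now = 4800L
    println(allowRelaxByLastLaunch(now, lastLaunch)) // false: never relaxes
    println(allowRelaxByCreation(now, creation))     // true: relaxed after 3s
  }
}
```

With steady sub-3000ms launches, the last-launch check never fires, matching the backlog described in the report; the creation-time check fires once and stays fired.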
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]