[
https://issues.apache.org/jira/browse/SPARK-21219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Vandenberg updated SPARK-21219:
------------------------------------
Attachment: spark_executor.log.anon
spark_driver.log.anon
> Task retry occurs on same executor due to race condition with blacklisting
> --------------------------------------------------------------------------
>
> Key: SPARK-21219
> URL: https://issues.apache.org/jira/browse/SPARK-21219
> Project: Spark
> Issue Type: Bug
> Components: Scheduler
> Affects Versions: 2.1.1
> Reporter: Eric Vandenberg
> Priority: Minor
> Attachments: spark_driver.log.anon, spark_executor.log.anon
>
>
> When a task fails, it is added back to the pending task list and the
> corresponding blacklist policy is enforced (i.e., specifying whether it
> can or cannot run on a particular node, executor, etc.). Unfortunately,
> the ordering is such that the retry can be assigned to the same executor
> before the blacklist is updated. That executor may, incidentally, be
> shutting down, in which case the retry fails immediately. Instead, the
> blacklist state should be updated first and the task assigned afterwards,
> ensuring that the blacklist policy is properly enforced.
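
The ordering problem described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's scheduler code: the function retry_executor, its parameters, and the executor names are all hypothetical, and the model assumes the executor that just failed the task is the first one offered for the retry.

```python
def retry_executor(executors, failed_executor, blacklist_first):
    """Return the executor a failed task is retried on in this toy model.

    Hypothetical illustration only; none of these names come from Spark.
    """
    task = "task-0"
    pending = [task]        # the failed task has been re-queued
    blacklisted = set()     # executors this task may not run on

    def record_failure():
        # Update the blacklist state for the failed attempt.
        blacklisted.add(failed_executor)

    def assign():
        # Assumption: the failed executor's slot was just freed,
        # so it is offered before the other executors.
        offers = [failed_executor] + [e for e in executors if e != failed_executor]
        for executor in offers:
            if task in pending and executor not in blacklisted:
                pending.remove(task)
                return executor
        return None  # no eligible executor

    if blacklist_first:
        # Proposed ordering: blacklist update, then assignment.
        record_failure()
        return assign()

    # Problematic ordering: assignment happens before the blacklist update.
    chosen = assign()
    record_failure()
    return chosen


print(retry_executor(["exec-1", "exec-2"], "exec-1", blacklist_first=False))  # exec-1
print(retry_executor(["exec-1", "exec-2"], "exec-1", blacklist_first=True))   # exec-2
```

With the assignment first, the retry lands on the executor that just failed. With the blacklist update first, that executor is skipped and the retry goes elsewhere.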