GitHub user hthuynh2 opened a pull request:
https://github.com/apache/spark/pull/21729
SPARK-24755 Executor loss can cause task to not be resubmitted
**Description**
As described in
[SPARK-24755](https://issues.apache.org/jira/browse/SPARK-24755), when
speculation is enabled, there is a scenario in which an executor loss can
cause a task to not be resubmitted.
This patch changes the variable killedByOtherAttempt to keep track of the
taskIds of tasks that are killed by another attempt. This way, we still avoid
resubmitting a task that was killed by another attempt, while resubmitting the
successful attempt when its executor is lost.
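
The bookkeeping change described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's actual `TaskSetManager` code: the names `Attempt`, `ToyTaskSet`, `launch`, `succeed`, and `executor_lost` are hypothetical, and only `killedByOtherAttempt` (written here as `killed_by_other_attempt`) comes from the description. The point it shows is that recording killed attempts by taskId, rather than per task, lets the successful attempt still be resubmitted when its executor is lost.

```python
from dataclasses import dataclass, field


@dataclass
class Attempt:
    task_id: int          # unique per attempt (hypothetical stand-in for a taskId)
    index: int            # partition index shared by all attempts of one task
    executor_id: str
    running: bool = True


@dataclass
class ToyTaskSet:
    attempts: dict = field(default_factory=dict)   # task_id -> Attempt
    successful: set = field(default_factory=set)   # partition indexes that finished
    # Holds taskIds of killed attempts, not partition indexes.
    killed_by_other_attempt: set = field(default_factory=set)

    def launch(self, task_id, index, executor_id):
        self.attempts[task_id] = Attempt(task_id, index, executor_id)

    def succeed(self, task_id):
        winner = self.attempts[task_id]
        winner.running = False
        self.successful.add(winner.index)
        # Kill the other running attempts of the same task and remember their ids.
        for other in self.attempts.values():
            if other.index == winner.index and other.running:
                other.running = False
                self.killed_by_other_attempt.add(other.task_id)

    def executor_lost(self, executor_id):
        """Return the partition indexes that need to be resubmitted."""
        resubmit = []
        for a in self.attempts.values():
            if (a.executor_id == executor_id
                    and a.index in self.successful
                    and not a.running
                    and a.task_id not in self.killed_by_other_attempt):
                self.successful.discard(a.index)
                resubmit.append(a.index)
        return resubmit


# Two attempts of task 0: attempt 0 succeeds on exec-A, so the speculative
# attempt 1 on exec-B is killed.
ts = ToyTaskSet()
ts.launch(task_id=0, index=0, executor_id="exec-A")
ts.launch(task_id=1, index=0, executor_id="exec-B")
ts.succeed(0)
lost_killed = ts.executor_lost("exec-B")   # only the killed attempt lived here
lost_winner = ts.executor_lost("exec-A")   # the successful attempt lived here
```

In this model, losing `exec-B` resubmits nothing, while losing `exec-A` resubmits task 0. If the set were keyed by partition index instead, the second call would also skip task 0, which is the kind of missed resubmission the JIRA describes.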
**How was this patch tested?**
A unit test is added, based on the one written by @xuanyuanking and modified
to simulate the scenario described in SPARK-24755.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hthuynh2/spark SPARK_24755
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21729.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21729
----
commit 093e39cf76378821284ef7d771e819afb69930ae
Author: Hieu Huynh <hieu.huynh@...>
Date: 2018-07-08T18:20:26Z
SPARK-24755 Executor loss can cause task to not be resubmitted
----