Github user mridulm commented on a diff in the pull request:
https://github.com/apache/spark/pull/21653#discussion_r200805005
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -723,6 +723,13 @@ private[spark] class TaskSetManager(
   def handleSuccessfulTask(tid: Long, result: DirectTaskResult[_]): Unit = {
     val info = taskInfos(tid)
     val index = info.index
+    // Check if any other attempt succeeded before this and this attempt has not been handled
+    if (successful(index) && killedByOtherAttempt(index)) {
--- End diff ---
For completeness, we will also need to 'undo' the changes made in
`enqueueSuccessfulTask`: that is, reverse the counters that `canFetchMoreResults` incremented.
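To illustrate the point about reversing the counters, here is a minimal, hypothetical sketch (a standalone stand-in, not the actual Spark code): the field names `totalResultSize` and `calculatedTasks` mirror the counters that `canFetchMoreResults` bumps in `TaskSetManager`, and `undoFetchResult` is an assumed helper name showing the kind of 'undo' being suggested.

```scala
// Hypothetical sketch: a stand-in for the result-size bookkeeping in
// TaskSetManager. Not the real Spark implementation.
class ResultSizeTracker(maxResultSize: Long) {
  private var totalResultSize: Long = 0L
  private var calculatedTasks: Int = 0

  // Mirrors the shape of canFetchMoreResults: count the incoming result
  // and check it against the configured size limit.
  def canFetchMoreResults(size: Long): Boolean = synchronized {
    totalResultSize += size
    calculatedTasks += 1
    maxResultSize == 0 || totalResultSize <= maxResultSize
  }

  // The proposed 'undo': reverse both counters when a fetched result is
  // later discarded because another attempt already succeeded.
  def undoFetchResult(size: Long): Unit = synchronized {
    totalResultSize -= size
    calculatedTasks -= 1
  }

  // Expose the counters so callers can observe the bookkeeping.
  def snapshot: (Long, Int) = synchronized((totalResultSize, calculatedTasks))
}
```

Without the `undoFetchResult` step, a result that is fetched and then dropped would still count against the limit, which is the leak this comment is pointing at.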
(Orthogonal to this PR): Looking at the use of `killedByOtherAttempt`, I see
that there is a bug in `executorLost` w.r.t. how it is updated - hopefully a fix
for SPARK-24755 won't cause issues here.
---