Github user tgravescs commented on a diff in the pull request:
https://github.com/apache/spark/pull/11996#discussion_r58604142
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -620,6 +620,14 @@ private[spark] class TaskSetManager(
// Note: "result.value()" only deserializes the value when it's called
at the first time, so
// here "result.value()" just returns the value and won't block other
threads.
sched.dagScheduler.taskEnded(tasks(index), Success, result.value(),
result.accumUpdates, info)
+ // Kill other task attempts if any as the one attempt succeeded
+ for (attemptInfo <- taskAttempts(index) if attemptInfo.attemptNumber
!= info.attemptNumber
+ && attemptInfo.running) {
+ logInfo("Killing attempt " + attemptInfo.attemptNumber + " for task
" + attemptInfo.id +
+ " in stage " + taskSet.id + " (TID " + attemptInfo.taskId + ") on
" + attemptInfo.host +
+ " as the attempt " + info.attemptNumber + " succeeded on " +
info.host)
+ sched.backend.killTask(attemptInfo.taskId, attemptInfo.executorId,
true)
--- End diff --
I think it would be better to have a killTask call in the taskScheduler
(similar to cancelTask) rather then reaching in and getting the backend
directly.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]