Github user pwendell commented on a diff in the pull request:
https://github.com/apache/spark/pull/686#discussion_r14211376
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -1062,10 +1062,15 @@ class DAGScheduler(
// This is the only job that uses this stage, so fail the stage
if it is running.
val stage = stageIdToStage(stageId)
if (runningStages.contains(stage)) {
- taskScheduler.cancelTasks(stageId, shouldInterruptThread)
- val stageInfo = stageToInfos(stage)
- stageInfo.stageFailed(failureReason)
-
listenerBus.post(SparkListenerStageCompleted(stageToInfos(stage)))
+ try { // cancelTasks will fail if a SchedulerBackend does not
implement killTask
+ taskScheduler.cancelTasks(stageId, shouldInterruptThread)
+ val stageInfo = stageToInfos(stage)
+ stageInfo.stageFailed(failureReason)
+
listenerBus.post(SparkListenerStageCompleted(stageToInfos(stage)))
+ } catch {
+ case e: UnsupportedOperationException =>
+ logInfo(s"Could not cancel tasks for stage $stageId", e)
+ }
--- End diff --
I think the correct semantics here are that we only trigger `jobFailed` if
the job was succesfully halted. This is mostly consumed downstream by tests,
but it's also used by code supporting asynchronous actions and approximate
results. In those cases, it would be better not to notify `jobFailed` if the
cancellation doesn't succeed, because they both assume that the hob has
finished executing if that message is received.
Separately we should probably update the documentation in cancel to explain
that it is a "best effort" method and will only be called if supported by the
underlying scheduler. Otherwise, we should say it will act as a no-op, i.e. it
will act is if `cancel` was never called. With this approach downstream
consumers will only have two cases to worry about (a job was cancelled or it
wasn't) rather than a third case, where we say it was cancelled but it
secretely actually wasn't.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---