tgravescs commented on a change in pull request #27050: [SPARK-30388][Core] Mark running map stages of finished job as finished, and cancel running tasks URL: https://github.com/apache/spark/pull/27050#discussion_r366602651
########## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ########## @@ -1431,21 +1434,28 @@ private[spark] class DAGScheduler( // If the whole job has finished, remove it if (job.numFinished == job.numPartitions) { markStageAsFinished(resultStage) - cleanupStateForJobAndIndependentStages(job) - try { - // killAllTaskAttempts will fail if a SchedulerBackend does not implement - // killTask. - logInfo(s"Job ${job.jobId} is finished. Cancelling potential speculative " + - "or zombie tasks for this job") - // ResultStage is only used by this job. It's safe to kill speculative or - // zombie tasks in this stage. - taskScheduler.killAllTaskAttempts( - stageId, - shouldInterruptTaskThread(job), - reason = "Stage finished") - } catch { - case e: UnsupportedOperationException => - logWarning(s"Could not cancel tasks for stage $stageId", e) + val removedStages = cleanupStateForJobAndIndependentStages(job) Review comment: there are other places cleanupStateForJobAndIndependentStages is called, this isn't handling this same thing. I'm wondering if the markStageAsFinished and canceltasks shouldn't be done inside the cleanupStatgeForJobAndIndependentStages. It would be nice to investigate that a little further and have tests to verify. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org