Re: [PR] [SPARK-46052][CORE] Remove function TaskScheduler.killAllTaskAttempts [spark]

via GitHub Tue, 28 Nov 2023 18:22:51 -0800


Ngone51 commented on code in PR #43954:
URL: https://github.com/apache/spark/pull/43954#discussion_r1408649384



##########
core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala:
##########
@@ -296,18 +296,32 @@ private[spark] class TaskSchedulerImpl(
     new TaskSetManager(this, taskSet, maxTaskFailures, healthTrackerOpt, clock)
   }
 
+  // Kill all the tasks in all the stage attempts of the same stage Id. Note stage attempts won't
+  // be aborted but will be marked as zombie. The stage attempt will be finished and cleaned up
+  // once all the tasks has been finished. The stage attempt could be aborted after the call of

Review Comment:
   ```
   def abort(message: String, exception: Option[Throwable] = None): Unit = sched.synchronized {
     sched.dagScheduler.taskSetFailed(taskSet, message, exception)
     isZombie = true
     maybeFinishTaskSet()
   }
   ```
   
   Whenever `abort` is called, the TSM is marked as zombie. So the key difference comes from `dagScheduler.taskSetFailed`, which essentially cleans up the data related to this stage and fails the jobs that depend on this stage.
   
   
   
   To the TSM itself, there's no difference between being marked as zombie and being aborted. In both cases, tasks in the TSM can still run until they finish (whether killed or succeeded).
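
   To make the distinction concrete, here is a toy sketch of the two paths. It is not Spark's real code: `ToyDagScheduler`, `ToyTaskSetManager`, `markZombie` and `taskEnded` are made-up names, and only the shape of `abort` follows the snippet above. It just shows that both paths set `isZombie` and let running tasks drain, while only `abort` notifies the DAG scheduler.

   ```scala
   // Toy model only; all class and method names here are hypothetical.
   final class ToyDagScheduler {
     // Records the task sets reported as failed (stands in for stage cleanup + job failure).
     var failedTaskSets: List[String] = Nil
     def taskSetFailed(taskSetId: String, message: String): Unit =
       failedTaskSets = taskSetId :: failedTaskSets
   }

   final class ToyTaskSetManager(val id: String, dag: ToyDagScheduler, numTasks: Int) {
     var isZombie: Boolean = false
     var finished: Boolean = false
     private var running: Int = numTasks

     def runningTasks: Int = running

     // Zombie path: stop scheduling new tasks, but tell the DAG scheduler nothing.
     def markZombie(): Unit = synchronized {
       isZombie = true
       maybeFinishTaskSet()
     }

     // Abort path: same as the zombie path, plus the DAG scheduler is notified first.
     def abort(message: String): Unit = synchronized {
       dag.taskSetFailed(id, message)
       isZombie = true
       maybeFinishTaskSet()
     }

     // Called when a running task ends, whether it was killed or succeeded.
     def taskEnded(): Unit = synchronized {
       running -= 1
       maybeFinishTaskSet()
     }

     // The task set is only finished once it is a zombie and nothing is running.
     private def maybeFinishTaskSet(): Unit =
       if (isZombie && running == 0) finished = true
   }

   object ZombieVsAbortDemo {
     def main(args: Array[String]): Unit = {
       val dag = new ToyDagScheduler
       val zombie = new ToyTaskSetManager("1.0", dag, numTasks = 2)
       val aborted = new ToyTaskSetManager("1.1", dag, numTasks = 2)
       zombie.markZombie()
       aborted.abort("stage cancelled")
       println(s"zombie:  isZombie=${zombie.isZombie} running=${zombie.runningTasks}")
       println(s"aborted: isZombie=${aborted.isZombie} running=${aborted.runningTasks}")
       println(s"reported to DAG scheduler: ${dag.failedTaskSets}")
     }
   }
   ```

   In this sketch both managers end up in the same local state; the only observable difference is the entry in `failedTaskSets`.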



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


