holdenk commented on a change in pull request #28817:
URL: https://github.com/apache/spark/pull/28817#discussion_r440404339
##########
File path:
core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
##########
@@ -258,26 +262,60 @@ private[spark] class CoarseGrainedExecutorBackend(
System.exit(code)
}
- private def decommissionSelf(): Boolean = {
- logInfo("Decommissioning self w/sync")
- try {
- decommissioned = true
- // Tell master we are are decommissioned so it stops trying to schedule
us
- if (driver.nonEmpty) {
- driver.get.askSync[Boolean](DecommissionExecutor(executorId))
+ private def shutdownIfDone(): Unit = {
+ val numRunningTasks = executor.numRunningTasks
+ logInfo(s"Checking to see if we can shutdown have ${numRunningTasks}
running tasks.")
+ if (executor.numRunningTasks == 0) {
+ if (env.conf.get(STORAGE_DECOMMISSION_ENABLED)) {
+ val allBlocksMigrated = env.blockManager.decommissionManager match {
+ case Some(m) => m.allBlocksMigrated
+ case None => false // We haven't started migrations yet.
+ }
+ if (allBlocksMigrated) {
+ logInfo("No running tasks, all blocks migrated, stopping.")
+ exitExecutor(0, "Finished decommissioning", notifyDriver = true)
Review comment:
Can you point to where in TaskSchedulerImpl it's going to fail the job?
`core/src/main/scala/org/apache/spark/deploy/master/Master.scala` is where the
current code is, but there might be an additional case that needs to be covered.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]