xuanyuanking commented on a change in pull request #24350: [SPARK-27348][Core] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend
URL: https://github.com/apache/spark/pull/24350#discussion_r276720394
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
 ##########
 @@ -205,6 +207,13 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
             // Note: we want to get an executor back after expiring this one,
             // so do not simply call `sc.killExecutor` here (SPARK-8119)
             sc.killAndReplaceExecutor(executorId)
+            // In case of the executors which are not gracefully shut down, we should remove
+            // lost executors from CoarseGrainedSchedulerBackend manually here (SPARK-27348)
+            sc.schedulerBackend match {
+              case backend: CoarseGrainedSchedulerBackend =>
 
 Review comment:
   Of the three kinds of SchedulerBackend, both `CoarseGrainedSchedulerBackend`
and `StandaloneSchedulerBackend` are handled by this case. The third,
`LocalSchedulerBackend`, does not need handling: this PR fixes the problem that
clearing lost executors depends on `onDisconnect` in the distributed scheduler
backends, and `LocalSchedulerBackend` has no such problem. Please correct me if
I'm wrong.
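
   To illustrate why a single `case backend: CoarseGrainedSchedulerBackend`
covers both distributed backends, here is a minimal, self-contained sketch. It
relies on `StandaloneSchedulerBackend` being a subclass of
`CoarseGrainedSchedulerBackend`. The classes below are simplified stand-ins, not
the real Spark definitions, and `register`, `removeLost`, `knows`, and
`ExpireSketch.expire` are hypothetical names used only for this example.

```scala
// Simplified stand-ins for the Spark classes; NOT the real definitions.
trait SchedulerBackend

// Hypothetical model: tracks executor IDs and lets the driver drop one explicitly.
class CoarseGrainedSchedulerBackend extends SchedulerBackend {
  private val executors = scala.collection.mutable.Set[String]()
  def register(id: String): Unit = executors += id
  def removeLost(id: String): Unit = executors -= id // hypothetical method name
  def knows(id: String): Boolean = executors.contains(id)
}

// The standalone backend is a subclass, so it matches the same typed pattern.
class StandaloneSchedulerBackend extends CoarseGrainedSchedulerBackend

// The local backend is unrelated to the coarse-grained one and falls through.
class LocalSchedulerBackend extends SchedulerBackend

object ExpireSketch {
  // Returns true if the backend was asked to remove the executor.
  def expire(backend: SchedulerBackend, executorId: String): Boolean =
    backend match {
      case b: CoarseGrainedSchedulerBackend =>
        b.removeLost(executorId)
        true
      case _ =>
        false // e.g. LocalSchedulerBackend: nothing to clean up
    }
}
```

   In this sketch, calling `ExpireSketch.expire` with a
`StandaloneSchedulerBackend` removes the executor, while calling it with a
`LocalSchedulerBackend` is a no-op.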
