itskals commented on a change in pull request #26440: [SPARK-20628][CORE][K8S] 
Start to improve Spark decommissioning & preemption support
URL: https://github.com/apache/spark/pull/26440#discussion_r376886127
 
 

 ##########
 File path: core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
 ##########
 @@ -140,6 +144,16 @@ private[spark] class CoarseGrainedExecutorBackend(
       if (executor == null) {
         exitExecutor(1, "Received LaunchTask command but executor was null")
       } else {
+        if (decommissioned) {
+          logError("Asked to launch a task while decommissioned.")
+          driver match {
+            case Some(endpoint) =>
 
 Review comment:
   
https://docs.google.com/document/d/1xVO1b6KAwdUhjEJBolVPl9C6sLj7oOveErwDSYdT-pE/edit?disco=AAAAI73a0FM
   I have left a comment on this in the design doc as well. I think the
   driver can handle this by not allocating tasks to the executor in the
   first place.
   When the driver is aware that a node may be decommissioned, it can stop
   allocating tasks to the executors on that node. This needs only a small
   code change in the driver's
   org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.DriverEndpoint#makeOffers:
 
   ```
          // Filter out executors on decommissioning workers
          val activeExecutors = executorDataMap.filterKeys(isExecutorActive)
            .filter { case (_, data) => !isNodeDecommissioning(data.executorHost) }
   ```
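
To make the filtering idea concrete outside of Spark, here is a minimal, self-contained sketch. `ExecutorInfo` and `OfferFilterSketch` are hypothetical stand-ins and not Spark classes; only the `executorHost` field name is taken from the snippet above, and the two predicates are passed in as plain functions.

```scala
// Hypothetical stand-in for Spark's executor metadata; only the host is modeled.
final case class ExecutorInfo(executorHost: String)

object OfferFilterSketch {
  // Keep only executors that are active and not on a decommissioning host.
  def schedulableExecutors(
      executors: Map[String, ExecutorInfo],
      isExecutorActive: String => Boolean,
      isNodeDecommissioning: String => Boolean): Map[String, ExecutorInfo] =
    executors.filter { case (id, info) =>
      isExecutorActive(id) && !isNodeDecommissioning(info.executorHost)
    }
}
```

For example, with executors "1" on host-a and "2" on host-b, and host-b marked as decommissioning, only executor "1" would remain eligible for offers.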
   For this there can be a DecommissionTracker (along the same lines as
   BlacklistTracker). The DT is populated when the driver is informed that a
   host is being decommissioned. In my comment on the design doc I have tried
   to elaborate on the flow that populates the DT.
   ```
     // Without a tracker, no node is treated as decommissioning.
     private def isNodeDecommissioning(hostname: String): Boolean = {
       decommissionTracker match {
         case None => false
         case Some(tracker) => tracker.isNodeDecommissioning(hostname)
       }
     }
   ```
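
As a rough illustration of what such a tracker could look like, here is a minimal sketch. Only the class name and `isNodeDecommissioning` come from the comment above; `addNodeToDecommission` and `removeNode` are hypothetical names, and the real flow for populating the tracker is the one described in the design doc, not this one.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical sketch of a driver-side tracker; the API is an assumption.
class DecommissionTracker {
  // Concurrent set, since the scheduler and RPC threads may both touch it.
  private val decommissioningHosts = ConcurrentHashMap.newKeySet[String]()

  // Assumed entry point: called when the driver learns a host is decommissioning.
  def addNodeToDecommission(hostname: String): Unit = {
    decommissioningHosts.add(hostname)
    ()
  }

  // Assumed cleanup hook: called once the host is gone or the decommission is cancelled.
  def removeNode(hostname: String): Unit = {
    decommissioningHosts.remove(hostname)
    ()
  }

  def isNodeDecommissioning(hostname: String): Boolean =
    decommissioningHosts.contains(hostname)
}
```

Keeping the tracker as a plain set of hostnames keeps the `makeOffers` check cheap, which matters because offers are made frequently.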
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

