mridulm commented on code in PR #35683:
URL: https://github.com/apache/spark/pull/35683#discussion_r859243236


##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala:
##########
@@ -422,6 +426,33 @@ private[yarn] class YarnAllocator(
     }
   }
 
+  private def handleNodesInDecommissioningState(allocateResponse: 
AllocateResponse): Unit = {
+    try {
+      // Some of the nodes are put in decommissioning state where RM did 
allocate
+      // resources on those nodes for earlier allocateResource calls, so 
notifying driver
+      // to put those executors in decommissioning state
+      allocateResponse.getUpdatedNodes.asScala.filter(_.getNodeState == 
NodeState.DECOMMISSIONING).
+        foreach(node => 
driverRef.send(DecommissionExecutorsOnHost(getHostAddress(node))))

Review Comment:
   This will send `DecommissionExecutorsOnHost` each time in 
`allocateResources` right (for the same set of hosts while they are in 
decomissioning state) ?
   If yes, keep track of nodes which we have already updated to driver with in 
a cache and avoid sending duplicate messages.
   



##########
resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala:
##########
@@ -422,6 +426,33 @@ private[yarn] class YarnAllocator(
     }
   }
 
+  private def handleNodesInDecommissioningState(allocateResponse: 
AllocateResponse): Unit = {
+    try {
+      // Some of the nodes are put in decommissioning state where RM did 
allocate
+      // resources on those nodes for earlier allocateResource calls, so 
notifying driver
+      // to put those executors in decommissioning state
+      allocateResponse.getUpdatedNodes.asScala.filter(_.getNodeState == 
NodeState.DECOMMISSIONING).
+        foreach(node => 
driverRef.send(DecommissionExecutorsOnHost(getHostAddress(node))))

Review Comment:
   This will send `DecommissionExecutorsOnHost` each time in 
`allocateResources` right (for the same set of hosts while they are in 
decomissioning state) ?
   If yes, we should keep track of nodes which we have already updated to 
driver with in a cache and avoid sending duplicate messages.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to