zhouyejoe commented on pull request #35683:
URL: https://github.com/apache/spark/pull/35683#issuecomment-1075979415


   I'm not quite sure what the RM returns in response to the AM's
allocateResource call. Will the response include resources on nodes that are
being decommissioned? IIUC, the RM should not allocate new containers on those
nodes once they start being decommissioned. Or does the RM instead report back
to the AM that some nodes are being decommissioned, namely nodes on which it
allocated resources in earlier allocateResource calls? If so, please add a
comment in `handleNodesInDecommissioningState`. Thanks.
   Indeed, we should also handle node decommissioning better for push-based
shuffle, although it won't trigger a retry there because it will fall back. The
same situation also applies to ESS nodes that hold unmerged shuffle files, and
in that case it will trigger a stage retry to regenerate the unmerged shuffle
data on other nodes.
   We did run into this during some NM updates that couldn't use
work-preserving restart. We decommissioned a few nodes at a time, in batches,
and waited until the shuffle data under tmp was cleaned up, to make sure the
cluster-wide NM update didn't trigger retries. This does increase the ops
overhead.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]