warrenzhu25 commented on PR #39280:
URL: https://github.com/apache/spark/pull/39280#issuecomment-1367694180

   > > > May I ask what is the benefit of handling an executor who did nothing 
yet, @warrenzhu25 ?
   > > > > Handle decommission request sent before executor registration
   > > 
   > > 
   > > When a decommission request is triggered from the worker, the executor 
should be decommissioned and stop accepting new tasks, but if the scheduler 
backend receives the request before the executor registers, the executor won't 
be decommissioned.
   > 
   > Could you elaborate on the real situation you hit? If the spot 
instances or K8s nodes are in the process of terminating, it seems that we 
are going to lose those underlying machines anyway, doesn't it?
   
   Imagine the following scenario:
   1. Executor 1 is launched but not yet registered.
   2. The node receives a decommission signal and asks the driver to 
decommission executor 1. The node will terminate in 1 minute.
   3. If the driver ignores the decommission request, it will schedule new 
tasks on executor 1 once it registers. After the fix, no new tasks are 
scheduled on executor 1 after it registers.
   4. 1 minute later, the node terminates and all tasks running on executor 1 
are lost.
   
   In our scenario, the node is terminated only after all executors have 
exited. When the decommission request is missed, the executor won't exit.
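
To make the ordering issue concrete, here is a minimal toy model of the behavior described above. It is not Spark's actual scheduler backend code: the class `ToySchedulerBackend`, its methods, and the `pending_decommission` set are hypothetical names chosen only for illustration. The sketch assumes the fix amounts to remembering a decommission request that arrives early and applying it when the executor registers.

```python
class ToySchedulerBackend:
    """Toy model of a driver-side backend (hypothetical, not Spark's code)."""

    def __init__(self):
        self.registered = set()
        self.decommissioning = set()
        # Assumption: requests that arrive before registration are kept here
        # instead of being dropped.
        self.pending_decommission = set()

    def decommission_executor(self, executor_id):
        if executor_id in self.registered:
            self.decommissioning.add(executor_id)
        else:
            # The executor is not registered yet: remember the request.
            self.pending_decommission.add(executor_id)

    def register_executor(self, executor_id):
        self.registered.add(executor_id)
        # Apply any decommission request that arrived before registration.
        if executor_id in self.pending_decommission:
            self.pending_decommission.discard(executor_id)
            self.decommissioning.add(executor_id)

    def schedulable_executors(self):
        # Executors that may still receive new tasks.
        return sorted(self.registered - self.decommissioning)


backend = ToySchedulerBackend()
backend.decommission_executor("1")  # step 2: request arrives first
backend.register_executor("1")      # executor 1 registers afterwards
backend.register_executor("2")      # an unaffected executor
print(backend.schedulable_executors())  # prints ['2']
```

In this model, executor 1 never becomes schedulable, so no new tasks would be placed on a node that is about to terminate.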


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


