Github user sitalkedia commented on the issue:

    https://github.com/apache/spark/pull/19048
  
    On a high level I agree that keeping the states in 3 places is creating a 
mess but changing that would require a big refactoring which is probably 
outside of the scope of this change.
    
    >> Or maybe you can reach the same thing through other means. For example, 
maybe if you get rid of the replace argument and make killExecutors not update 
the CGSB target count, and then force the caller to call requestTotalExecutors 
before killing executors, you could achieve the same thing. Maybe there are 
corner cases doing that, but maybe it works?
    
    That might work. But there is a race-condition in doing that.  In order to 
do that, we need to have a `getTotalExecutors` api to CGSB and the caller needs 
to use the api as follows
    
    `val executors = getTotalExecutors();
    requestTotalExecutors(executors - 1)
    killExecutors(x)`
    
    It is possible that the total executors value changed by another thread 
between `getTotalExecutors` and `requestTotalExecutors` call. Setting aside the 
above potential race condition, I do not personally like that the caller need 
to call three apis which could have been done in just one.
    
    Instead of doing that, how about we add a new api to 
`ExecutorAllocationClient` which looks like this  - 
killExecutors(List<Executors>, Int requestedTotal) - This will be used only by 
the EAM to kill a specific executor and also set the total exected count in 
CGSB. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to