Github user sitalkedia commented on the issue:
https://github.com/apache/spark/pull/19048
On a high level I agree that keeping the states in 3 places is creating a
mess but changing that would require a big refactoring which is probably
outside of the scope of this change.
>> Or maybe you can reach the same thing through other means. For example,
maybe if you get rid of the replace argument and make killExecutors not update
the CGSB target count, and then force the caller to call requestTotalExecutors
before killing executors, you could achieve the same thing. Maybe there are
corner cases doing that, but maybe it works?
That might work. But there is a race-condition in doing that. In order to
do that, we need to have a `getTotalExecutors` api to CGSB and the caller needs
to use the api as follows
`val executors = getTotalExecutors();
requestTotalExecutors(executors - 1)
killExecutors(x)`
It is possible that the total executors value changed by another thread
between `getTotalExecutors` and `requestTotalExecutors` call. Setting aside the
above potential race condition, I do not personally like that the caller need
to call three apis which could have been done in just one.
Instead of doing that, how about we add a new api to
`ExecutorAllocationClient` which looks like this -
killExecutors(List<Executors>, Int requestedTotal) - This will be used only by
the EAM to kill a specific executor and also set the total exected count in
CGSB.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]