Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/5536#issuecomment-94592575
I believe this is the right fix, but verifying the correctness of this change is actually fairly non-trivial. When we request a new total that is less than the current total, `YarnAllocator` will lower the number of pending executor requests but will *not kill existing executors* to get down to that total. As a result, the totals tracked by `ExecutorAllocationManager` and `YarnAllocator` diverge, and `ExecutorAllocationManager` uses its now-incorrect total to compute the number of pending executors, which can come out negative.
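To make that asymmetry concrete, here is a toy model of the YARN-side bookkeeping, based only on the behavior described above; the field names (`targetNumExecutors`, `numPendingAllocate`, `numExecutorsRunning`) are illustrative, not necessarily the actual `YarnAllocator` internals:
```
// Illustrative sketch: lowering the target shrinks (or cancels) pending
// container requests, but never touches executors that are already running.
class ToyYarnAllocator {
  private var targetNumExecutors = 10
  private var numPendingAllocate = 0
  private val numExecutorsRunning = 10

  def requestTotalExecutors(requestedTotal: Int): Unit = synchronized {
    targetNumExecutors = requestedTotal
  }

  def updateResourceRequests(): Unit = synchronized {
    val missing = targetNumExecutors - numPendingAllocate - numExecutorsRunning
    if (missing > 0) {
      numPendingAllocate += missing  // ask YARN for more containers
    } else {
      // Cancel at most all pending requests; running executors stay up.
      numPendingAllocate = math.max(0, numPendingAllocate + missing)
    }
  }
}
```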
A simple example: we have 10 executors with nothing pending to be added or removed. Then all tasks suddenly finish running, so the number of executors needed is now 0. When we try to cancel the outstanding requests, we will request a total of 0 executors. If we follow the math in `updateNumPendingExecutors`, we get -10:
```
// 0 - 10 + 0 = -10
numExecutorsPending =
  newTotalExecutors - executorIds.size + executorsPendingToRemove.size
```
which doesn't make sense because we can't have a negative number of pending
executors.
Why is this a correct fix then? Well, the only place where we ever request a total less than the current one is when we cancel outstanding requests, and in that case `YarnAllocator` will at most cancel all pending requests; it never kills existing executors. Flooring `numExecutorsPending` at 0 enforces the same behavior on the `ExecutorAllocationManager` side.
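Concretely, the fix amounts to flooring the computed value at zero, along these lines (a minimal sketch of the clamping, not necessarily the exact patch):
```
// Never go negative: mirrors the YARN side, which at most cancels all
// pending requests and never kills running executors.
numExecutorsPending = math.max(
  newTotalExecutors - executorIds.size + executorsPendingToRemove.size, 0)
```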