Github user sryza commented on the pull request:
https://github.com/apache/spark/pull/5536#issuecomment-94945086
Thanks Andrew, I think you've nailed the part of the problem that I was
missing: when executors die or are killed, executorIds.size decreases. If
numPendingExecutors is negative at this point, this can lead targetNumExecutors
to go below 0.
My opinion is that it's best not to think in terms of adding and canceling
requests, but to think in terms of expressing a preference about the total
number of executors. I believe that it's much easier to reason about
invariants around this number than it is to reason about sequences of add,
cancel, and remove events. Also, canceling isn't entirely accurate because the
YarnAllocator doesn't ignore requests for fewer total executors than the
current number allocated. It registers the new number as its target number of
executors, which affects the behavior when an executor dies or is killed.
What we ultimately care about is that targetNumExecutors should never go
below 0. We only care about numExecutorsPending going below 0 to the extent
that it causes the former to occur.
I think we can fully specify the behavior of targetNumExecutors in a
reasonable way:
* targetNumExecutors should reflect the number of executors YarnAllocator
should be willing to submit requests for. I.e. if all our executors were to
suddenly die, how many requests would YarnAllocator submit to YARN.
* targetNumExecutors can change in two ways:
** When the maximum number of executors needed (based on current running +
pending tasks) is below the current target, we decrease the current target to
equal it.
** When the maximum number of executors needed is above the current target,
we increase the current target according to the exponential add policy.
With this view, we can separate killing executors from the
targetNumExecutors calculation entirely. When targetNumExecutors is less than
executorIds.size for a given period of time, we issue kill requests. These
kill requests do not affect targetNumExecutors, neither while the kill is
pending nor after the kill has gone through.
I'm pretty convinced that this approach is the most straightforward and
easiest to reason about. It requires slightly modifying the logic for removing
executors, but I think in a way that's in line with its original goals. We
also get rid of numExecutorsPending and track targetNumExecutors directly. If
this seems reasonable to you Andrew, I'll put up a patch that makes the
required changes.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]