Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/3550#issuecomment-65269257
It's worth spending a little time checking that every executor that is
RUNNING for an application will definitely transition to a Finished state and
be removed from the master's accounting if the application dies. If we are
certain that all the running executors will finish after application death, and
that repeatedly failing executors from a bad node while a running executor
remains on the master's books will not progressively consume resources, then I
think this PR solves the problems. The only sort-of negative that I see is
that there can be an arbitrarily large number of failed executor launch
attempts while at least one executor remains running, which will at the very
least fill up the error logs. That is arguably not an entirely bad thing,
though, and it is something whose proper resolution is better handled (at
least for now) by a system administrator than by an attempt to automate it.
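
To make the two conditions above concrete, here is a rough toy model of the
accounting being discussed. It is not Spark's Master code: the class, the
method names, the FINISHED state name, and the max_retries value are all
hypothetical, and the removal rule (remove the application only when launches
keep failing and no executor is RUNNING) is my reading of the behavior
described in this comment, not a copy of the PR.

```python
from dataclasses import dataclass, field
from enum import Enum


class ExecState(Enum):
    LAUNCHING = "LAUNCHING"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    FINISHED = "FINISHED"  # hypothetical name for the terminal state


@dataclass
class AppAccounting:
    """Toy stand-in for the master's per-application bookkeeping."""
    max_retries: int = 10                          # hypothetical limit
    executors: dict = field(default_factory=dict)  # executor id -> ExecState
    failed_launches: int = 0
    removed: bool = False

    def on_state_change(self, exec_id, state):
        if state in (ExecState.FAILED, ExecState.FINISHED):
            # Terminal states take the executor off the books, so a failed
            # launch does not leave anything behind to accumulate.
            self.executors.pop(exec_id, None)
        else:
            self.executors[exec_id] = state

        if state is ExecState.FAILED:
            self.failed_launches += 1
            has_running = any(
                s is ExecState.RUNNING for s in self.executors.values()
            )
            # Assumed rule: give up on the application only when nothing
            # is RUNNING any more.
            if self.failed_launches >= self.max_retries and not has_running:
                self.removed = True

    def on_app_death(self):
        # The property worth verifying: every RUNNING executor reaches a
        # terminal state once the application is gone.
        running = [e for e, s in self.executors.items()
                   if s is ExecState.RUNNING]
        for exec_id in running:
            self.on_state_change(exec_id, ExecState.FINISHED)
```

In this model, one healthy executor plus any number of failures from a bad
node leaves the application in place and the executor table at a constant
size; only the failure counter (and, in practice, the error log) grows. Once
the application dies and its running executor finishes, the next failed launch
removes the application.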