Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/7888#issuecomment-137541708
Hi @GraceH, thanks for fixing this problem. I agree with the problem
statement and the root cause. However, there are two outstanding issues with
the solution:
(1) `ExecutorAllocationManager` currently assumes that all remove requests
will eventually be fulfilled. As of this patch, that assumption no longer
holds if the executor turns out to be busy. In that case we need some way to
remove it from `executorsPendingToRemove`. We can do this, for instance, when
we get a `TaskStart` and/or `TaskEnd` event from that executor.
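To make (1) concrete, here is a minimal sketch of the bookkeeping I have in mind. The class and method names here are hypothetical stand-ins, not the actual `ExecutorAllocationManager` code: the idea is simply that a task-event hook on an executor we asked to kill tells us the kill was rejected, so we stop treating it as pending.

```scala
import scala.collection.mutable

// Hypothetical sketch: minimal model of the pending-to-remove bookkeeping.
class PendingRemoveTracker {
  // Executors we have asked the cluster manager to remove.
  private val executorsPendingToRemove = mutable.Set[String]()

  def markPendingToRemove(executorId: String): Unit = {
    executorsPendingToRemove += executorId
  }

  // Called from a SparkListener-style onTaskStart/onTaskEnd hook: task
  // activity on this executor means the remove request was not fulfilled
  // (the executor was busy), so stop waiting for it to go away.
  def onTaskEvent(executorId: String): Unit = {
    executorsPendingToRemove -= executorId
  }

  def isPendingToRemove(executorId: String): Boolean =
    executorsPendingToRemove.contains(executorId)
}
```

With this, a busy executor whose kill was rejected is returned to the "known alive" state as soon as it reports a task event, instead of being counted as pending forever.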
(2) Currently we never set `force = true`, right? I think we should set it
to true if the user explicitly calls `sc.killExecutor`. However, to distinguish
between that case and dynamic allocation, we'll need to add more
`private[spark]` methods to `ExecutorAllocationClient` that accept the `force`
flag. Then we'll have the `ExecutorAllocationManager` call the internal methods
and explicitly pass `force = false`.
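For (2), a rough sketch of the split I'm proposing follows. Everything here is illustrative: the trait and method names are placeholders, and in Spark the internal method would be `private[spark]` on `ExecutorAllocationClient` (plain visibility is used below only to keep the sketch self-contained and compilable).

```scala
// Hypothetical sketch: the public user-facing API always forces the kill,
// while dynamic allocation calls an internal entry point with force = false.
trait ExecutorAllocationClientSketch {
  // Public API, reached via sc.killExecutors: the user asked explicitly,
  // so kill the executors even if they are busy.
  def killExecutors(ids: Seq[String]): Seq[String] =
    killExecutorsInternal(ids, force = true)

  // Internal entry point carrying the force flag; in Spark this would be
  // private[spark], and ExecutorAllocationManager would call it directly
  // with force = false so busy executors are left alone.
  def killExecutorsInternal(ids: Seq[String], force: Boolean): Seq[String]
}

// Toy implementation for illustration: only forced requests may kill a
// busy executor; unforced (dynamic allocation) requests skip it.
class FakeClient extends ExecutorAllocationClientSketch {
  private val busy = Set("exec-busy")

  override def killExecutorsInternal(ids: Seq[String],
                                     force: Boolean): Seq[String] =
    ids.filter(id => force || !busy.contains(id))
}
```

This keeps the user-visible signature of `killExecutors` unchanged while letting the two callers express different intents through the internal flag.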
What do you think?