Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/9796#issuecomment-159719315
@GraceH continuing our discussion on the "pending replacement" test:
The problem is actually in how the test framework is set up. To speed up the
tests we don't wait for new executors to register, so this test was
incorrectly written even before this patch. This is because:
- First we start with 2 executors, say {27, 28}.
- Then we kill and replace 1. What we *should* end up with is {28, 29}, but
in this test we don't wait for executor 29 to come up, so we still have
{27, 28}.
- Then we try to kill 27 again; this fails, which is why
`sc.killExecutor(27)` returned false.
The right fix would be to add an `eventually` block in the test that waits
until we have 2 executors but with different IDs. We can do this by capturing
the executor IDs before and after the call to `sc.killAndReplace` and
asserting `executorIdsBefore != executorIdsAfter`. Even then, some of the
following test logic has to change.
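To sketch the idea, here is a minimal self-contained Scala example of the polling pattern. It does not use Spark or ScalaTest: the `eventually` helper is a stand-in for ScalaTest's, and the background thread flipping `executorIds` from {27, 28} to {28, 29} simulates the replacement executor registering. All names here (`executorIds`, the timeout values) are illustrative, not the actual test's code:

```scala
import scala.concurrent.duration._

object ReplacementWait {
  // Minimal stand-in for ScalaTest's `eventually`: retry the block
  // until it succeeds or the deadline passes.
  def eventually[T](timeout: FiniteDuration, interval: FiniteDuration)(block: => T): T = {
    val deadline = timeout.fromNow
    def attempt(): T =
      try block
      catch {
        case _: Throwable if deadline.hasTimeLeft() =>
          Thread.sleep(interval.toMillis)
          attempt()
      }
    attempt()
  }

  def main(args: Array[String]): Unit = {
    // Simulated cluster state: executor 27 is eventually replaced by 29.
    @volatile var executorIds = Set("27", "28")
    new Thread(() => { Thread.sleep(200); executorIds = Set("28", "29") }).start()

    val executorIdsBefore = executorIds
    // Wait until the replacement has registered, i.e. we still have 2
    // executors but the set of IDs has actually changed.
    val executorIdsAfter = eventually(5.seconds, 50.millis) {
      val current = executorIds
      assert(current.size == 2 && current != executorIdsBefore)
      current
    }
    println(executorIdsAfter) // Set(28, 29)
  }
}
```

With this in place, the later assertions in the test would operate on the post-replacement IDs (e.g. killing 29 instead of 27), which is why some of the following test logic has to change too.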
Fixing this test may be fairly involved. Will you have time to look into
this? If not, I can take over later after the 1.6 release.