GitHub user tgravescs commented on the pull request:
https://github.com/apache/spark/pull/10951#issuecomment-181382033
Hey @andrewor14. Yes, these changes fix the issue. It's really easy to
reproduce if you want to test it yourself; just follow the instructions in
the JIRA. Dynamic allocation is pretty broken right now with speculation on,
at least if you want it to give executors back. Here I changed only what was
necessary (roughly the minimum) to make it work, and all the tests I ran
passed. I ran them many times because it's all timing dependent: the result
depends on the order in which the events come in, whether speculative tasks
finish before the originals, etc.
Honestly, the speculation code needs some cleanup and rework. I am going to
file another JIRA for that, though.
In my opinion, you can try to make things happen in the right order as much
as possible, but this is a distributed system and you have to be able to handle
things coming out of order.
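To illustrate what handling out-of-order events can look like, here is a
minimal sketch in Python. It is not the code in this pull request and not
Spark's API: the class `RunningTaskTracker` and its methods are hypothetical
names made up for the example. It assumes each task attempt (original or
speculative) has a unique id, and that an "end" event may arrive before the
matching "start" event.

```python
class RunningTaskTracker:
    """Hypothetical tracker of running task attempts that tolerates an
    end event arriving before its matching start event."""

    def __init__(self):
        # Attempts that have started and not yet ended.
        self.running = set()
        # Attempts whose end event arrived before the start event.
        self.ended_early = set()

    def on_task_start(self, attempt_id):
        if attempt_id in self.ended_early:
            # The end was already seen; the late start cancels it out
            # instead of leaving a phantom running attempt behind.
            self.ended_early.discard(attempt_id)
        else:
            self.running.add(attempt_id)

    def on_task_end(self, attempt_id):
        if attempt_id in self.running:
            self.running.discard(attempt_id)
        else:
            # Remember the early end so the late start is ignored.
            self.ended_early.add(attempt_id)

    def is_idle(self):
        # True when no attempt is known to be running.
        return not self.running


tracker = RunningTaskTracker()
tracker.on_task_start("task-1.0")   # original attempt
tracker.on_task_end("task-1.1")     # speculative attempt: end arrives first
tracker.on_task_start("task-1.1")   # late start is absorbed
tracker.on_task_end("task-1.0")
print(tracker.is_idle())
```

The point of the sketch is only that the bookkeeping ends up in the same
state whichever order the start and end events arrive in, so a component
deciding whether an executor is idle is not misled by a late event.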