Github user squito commented on the pull request:
https://github.com/apache/spark/pull/4155#issuecomment-71883720
I worry about how complicated the test is, and how much it needs to muck
around with internals ... it may be hard to keep up to date as those internals
change. And I'm not sure it actually tests the behavior you are trying to
verify: I just grabbed only `OutputCommitCoordinatorSuite` &
`DAGSchedulerSingleThreadedProcessLoop`, without any of the other changes,
and the tests still pass. (Which is not to say the test is useless.)
Unfortunately, I don't have any good ideas for a better test. I've been
thinking about it for a while and still haven't come up with anything. You
could run a speculative job which writes to Hadoop and see if it works, though
that could easily pass even if this change didn't always work. Maybe
`OutputCommitCoordinator` needs some special hooks for testing. I'll keep
thinking about it, but wanted to get others' thoughts.
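For what it's worth, a minimal sketch of the speculative-job idea might look like the following. This is only an illustration, not part of the PR: the output path is made up, and a passing run doesn't prove much, since (as noted above) the race this PR targets may simply not fire.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical end-to-end check: run a job with speculation enabled and
// write through the Hadoop output path, then eyeball the result for
// duplicate or missing partitions.
object SpeculativeCommitCheck {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("speculative-commit-check")
      .setMaster("local[4]")
      // Aggressive speculation settings to encourage duplicate task attempts.
      .set("spark.speculation", "true")
      .set("spark.speculation.multiplier", "1.5")
    val sc = new SparkContext(conf)
    try {
      // Skew one partition so speculation is likely to launch a second attempt.
      sc.parallelize(1 to 1000, 8)
        .map { i => if (i % 8 == 0) Thread.sleep(5); (i, i) }
        .saveAsTextFile("/tmp/speculative-commit-check") // hypothetical path
    } finally {
      sc.stop()
    }
  }
}
```

Even if the output comes out clean, that only shows the committer won the race this time, which is why per-attempt hooks inside `OutputCommitCoordinator` would be a more deterministic way to exercise the deny/allow decisions directly.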