Github user squito commented on the pull request:
https://github.com/apache/spark/pull/5964#issuecomment-104660767
@kayousterhout I debated doing that, but I kept them together b/c my test
case produces all four. So I wouldn't have passing tests unless I addressed
them all. (Though I could actually relax the criteria in the test so 3 & 4
were unnecessary.)
To put it another way: I'm happy to restructure this for merging. But, I
feel that reviewers should consider all of the issues *together* to properly
grasp what is going wrong in the existing implementation. Again, I'd like to
stress running the reproduction and seeing what is wrong (ideally adding a loop
and running it many times) , that is far more important than the diff for
reviewing this IMO.
Though maybe this also leads to another question for reviewers -- how do
you feel about that test? Its unlike our other unit tests, in that it doesn't
try to very narrowly recreate one issue. instead it simulates a workload, with
some randomization. The downside of that test is, its slow and you can't
easily tease apart the different issues into separate tests. But the upside is
that you actually get better coverage -- eg., I wouldn't have discovered issue
(2) without this.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]