GitHub user aarondav commented on the pull request:
https://github.com/apache/spark/pull/242#issuecomment-39374474
No ideas here, though returning an Iterable will probably make an actual
iterator-backed cogroup significantly harder to implement unless we always
store the result to disk, as we do for shuffles (which has performance
implications). That might be good anyway, though: an Iterable can be
traversed at any point, possibly well after the job finishes, so otherwise we
would have to keep some sort of stream running to satisfy it.
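
To make the trade-off concrete, here is a minimal sketch in plain Python
rather than Spark code. SpilledIterable and one_shot_values are hypothetical
names invented for illustration, and pickling to a temp file only stands in
for whatever on-disk format a real implementation would use. It shows that a
one-shot stream yields nothing on a second pass, while a disk-backed wrapper
can be re-read at any time at the cost of writing everything out first.

```python
import os
import pickle
import tempfile
from typing import Iterator


class SpilledIterable:
    """Hypothetical re-readable view over a one-shot stream, backed by a temp file."""

    def __init__(self, stream: Iterator) -> None:
        # Drain the single-pass stream to disk once, up front.
        fd, self.path = tempfile.mkstemp(suffix=".spill")
        with os.fdopen(fd, "wb") as out:
            for item in stream:
                pickle.dump(item, out)

    def __iter__(self) -> Iterator:
        # Each call re-opens the file, so a traversal can start at any time.
        with open(self.path, "rb") as src:
            while True:
                try:
                    yield pickle.load(src)
                except EOFError:
                    return

    def close(self) -> None:
        # Remove the spill file once no further reads are expected.
        os.remove(self.path)


def one_shot_values() -> Iterator[int]:
    # Stand-in for values streamed out of a cogroup; consumable exactly once.
    yield from (1, 2, 3)


stream = one_shot_values()
first_pass = list(stream)
second_pass = list(stream)  # the generator is already exhausted here

spilled = SpilledIterable(one_shot_values())
spilled_first = list(spilled)
spilled_second = list(spilled)  # re-reads the file from the start
spilled.close()
```

The wrapper pays the full write cost before the first element can be read,
which is the performance implication mentioned above. The alternative is to
keep the underlying stream alive for as long as the Iterable is reachable.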