GitHub user rxin commented on the pull request:
https://github.com/apache/spark/pull/506#issuecomment-41195545
@mridulm this is not a very general solution yet, and it can have very bad
consequences (e.g. when data are not cached in memory). If we want a more
reliable allReduce, we should probably look into some sort of shuffle
dependency that is not all-to-all. The main problem I see with modeling this
via shuffle is having to send a bunch of 0s back to the driver for shuffle
block size estimation; we might be able to use run-length encoding to make
that transmission cheap.
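To make the run-length-encoding point concrete, here is a minimal sketch (not Spark's actual map-status code; the function names are hypothetical) showing how a block-size vector dominated by zeros, as in a non-all-to-all dependency where each reducer reads from only a few mappers, collapses to a handful of (value, count) pairs:

```python
from itertools import groupby

def rle_encode(sizes):
    """Run-length encode a list of block sizes as (value, count) pairs."""
    return [(v, len(list(group))) for v, group in groupby(sizes)]

def rle_decode(pairs):
    """Expand (value, count) pairs back into the original list."""
    return [v for v, count in pairs for _ in range(count)]

# One mapper's report for 2000 reduce partitions, where only one
# partition actually receives data: mostly zeros.
sizes = [0] * 1000 + [4096] + [0] * 999
encoded = rle_encode(sizes)

assert encoded == [(0, 1000), (4096, 1), (0, 999)]  # 3 runs vs. 2000 entries
assert rle_decode(encoded) == sizes
```

The encoded form is what would travel to the driver; with long runs of zeros the cost becomes proportional to the number of non-empty blocks rather than the number of partitions.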