GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/8829
[SPARK-10708] [WIP] Consolidate sort shuffle implementations
There's a lot of duplication between SortShuffleManager and
UnsafeShuffleManager. Now that UnsafeShuffleManager supports large records,
the two managers provide the same functionality, so I think that we should
replace SortShuffleManager's serialized shuffle implementation with
UnsafeShuffleManager's and merge the two managers together.
This is temporarily rebased on top of #8825. There are still many
documentation and configuration updates that need to be performed.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark consolidate-sort-shuffle-implementations
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/8829.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8829
----
commit ad532e49b0093c8f6529ede09d794c975fb1ce69
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T17:44:50Z
Rename HashShuffleReader to ShuffleReader.
commit 90e01c09ea79c7c9c0006878434a32c3d568bc6b
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T18:09:30Z
Change getReader() to only accept a single partition:
There was no ShuffleReader implementation that supported the ability to fetch
a range of partitions. If we do want to implement support for fetching multiple
partitions then we'll likely want to support the ability to fetch sets of
partitions with non-contiguous ids, which this interface doesn't support.
commit a0e7fc6aad69fe22602700d9f6f74716b15d7d86
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T18:48:07Z
Consolidate getReader() implementations.
commit 36b25e347f0573d96a5940ead837e02a10e5ae08
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T20:13:35Z
Fix MiMa.
commit 43eb0cf2e1ee242724c6f329b8e0ce64b419b806
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T20:53:36Z
Merge remote-tracking branch 'origin/master' into shuffle-reader-cleanup
commit 803f62f0ca51f221ee5a7ef28e4def0bc8f40b4b
Author: Josh Rosen <[email protected]>
Date: 2015-09-18T19:59:26Z
WIP towards consolidation of sort shuffle implementations.
----