GitHub user jerryshao opened a pull request:
https://github.com/apache/spark/pull/3438
[SPARK-2926][Shuffle]Add MR style sort-merge shuffle read for Spark
sort-based shuffle
This is joint work with @sryza. Details and a performance test report are
available in [SPARK-2926](https://issues.apache.org/jira/browse/SPARK-2926).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jerryshao/apache-spark sort-shuffle-read-new-netty
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/3438.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3438
----
commit 32b4e6701ac89b5fb9932c9954a1f922bbdffb64
Author: jerryshao <[email protected]>
Date: 2014-09-05T08:46:12Z
initial commit of sort-merge shuffle reader
Conflicts:
core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala
core/src/main/scala/org/apache/spark/storage/BlockManager.scala
core/src/test/scala/org/apache/spark/storage/BlockFetcherIteratorSuite.scala
Conflicts:
core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala
commit 54dd3019ffc35c4d3f48b3dc90e038fb0611300d
Author: Sandy Ryza <[email protected]>
Date: 2014-10-22T19:49:35Z
Readability improvements to SortShuffleReader
commit 85b59115d8e52011d75ab6350934c73edc43060f
Author: Sandy Ryza <[email protected]>
Date: 2014-10-23T05:12:53Z
Clarify mergeWidth logic
commit 00300970ee6d60646ec929bd9fbe5b448f5dfd53
Author: Sandy Ryza <[email protected]>
Date: 2014-10-23T17:09:49Z
Add blocks remaining at level counter back in
commit 156725cfe8fdba686ebd9704bdf9dfd906b048a1
Author: Sandy Ryza <[email protected]>
Date: 2014-10-24T22:30:07Z
Small fix
commit afb2233dc0a92e6e971e2159412769c8941d6c26
Author: Sandy Ryza <[email protected]>
Date: 2014-10-25T22:45:20Z
Move merge to a separate class and use a priority queue instead of levels
commit c822ac248ee158aa312340b86e74b29cee19f775
Author: jerryshao <[email protected]>
Date: 2014-10-30T05:00:02Z
Rebase to the latest code and fix some conflicts
commit 1364e3614ce7c6db723d0d5418aaad96944823ee
Author: jerryshao <[email protected]>
Date: 2014-11-04T07:11:29Z
SortShuffleReader code improvement
commit 4c327de8e32bc77b8aff55cfd0e05ceb29c114aa
Author: Sandy Ryza <[email protected]>
Date: 2014-11-05T07:30:46Z
Don't spill more blocks than we need to
commit 7ba4b15a86d9e6a17f2275fe8119ae76fb2e37e0
Author: Sandy Ryza <[email protected]>
Date: 2014-11-05T08:57:16Z
Fix bug: add to inMemoryBlocks
commit d3c4282fcda763124fd4e2b11e013389e4087847
Author: jerryshao <[email protected]>
Date: 2014-11-05T01:10:14Z
Changes to rebase to the latest master branch
commit 2123605c0cc8e69d266115741f92d305e531360d
Author: Sandy Ryza <[email protected]>
Date: 2014-11-05T18:39:12Z
Fix another bug
commit 624b0a050c99be472fb320acb8a5b5e61ce29813
Author: jerryshao <[email protected]>
Date: 2014-11-05T09:22:20Z
Bug fix and revert ShuffleMemoryManager
commit d2da7e89e92f224c342120a82ee5824cba35942a
Author: jerryshao <[email protected]>
Date: 2014-11-07T03:02:14Z
Fix some bugs in spilling to disk
commit 858ac7b6d00ed1659ec765cf31b2c0ddc6102a5d
Author: jerryshao <[email protected]>
Date: 2014-11-10T05:10:21Z
Modify to use BlockObjectWriter to write data
commit acad11b464624d08c4578e44915f73e4d0c505db
Author: jerryshao <[email protected]>
Date: 2014-11-11T05:13:25Z
Fix incorrect block size introduced bugs
commit f55a1efdfd9da1b46a2bf831a55ba22c15f700bb
Author: jerryshao <[email protected]>
Date: 2014-11-12T08:23:47Z
Address the comments
commit 5339ceb50717b595cf6f89b6f13c4bb3f3b93422
Author: jerryshao <[email protected]>
Date: 2014-11-12T13:17:04Z
Fix some bugs
commit 83f7d554ca65ba3ed5b773593abb8a6183742c23
Author: jerryshao <[email protected]>
Date: 2014-11-14T14:11:16Z
Improve the failure process and expand ManagedBuffer
commit 17b36d308e05ae1f72fc60a2151ae79e754f1bd4
Author: jerryshao <[email protected]>
Date: 2014-11-17T05:07:32Z
Copy the memory from off-heap to on-heap and some code style modification
commit 34865d2b1212b893b6630cd775ed0e3dc49a54c4
Author: jerryshao <[email protected]>
Date: 2014-11-18T07:19:30Z
Fix rebase introduced issue
commit a3da81cffb5e6f7e6f9df76d19f2314a14386344
Author: jerryshao <[email protected]>
Date: 2014-11-18T07:27:10Z
Revert some unwanted changes
commit 106800553b51b2a7609d3c6fb819d57f2eb183be
Author: Sandy Ryza <[email protected]>
Date: 2014-11-24T08:23:03Z
Clean up comments, break up large methods, spill based on actual block
size, and properly increment _diskBytesSpilled
commit bfc2614f05f453bc052090d08a0dd12d19a9c2d5
Author: jerryshao <[email protected]>
Date: 2014-11-25T01:15:58Z
Log improve
----