GitHub user mateiz opened a pull request:

    https://github.com/apache/spark/pull/8844

    [SPARK-9852] Let reduce tasks fetch multiple map output partitions

    This makes two changes:
    
    - Allow reduce tasks to fetch multiple map output partitions -- this is a
      pretty small change to HashShuffleFetcher
    - Move shuffle locality computation out of DAGScheduler and into
      ShuffledRDD / MapOutputTracker; this was needed because the code in
      DAGScheduler wouldn't work for RDDs that fetch multiple map output
      partitions from each reduce task
    
    I also added an AdaptiveSchedulingSuite that creates RDDs depending on
    multiple map output partitions.
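
As a rough illustration of the two changes described above, the following is a plain-Python model, not Spark code. It assumes each map task records its host and the bytes it wrote per reduce partition, and that a reduce task covers a half-open range of partitions. The names `blocks_to_fetch` and `preferred_hosts`, the `map_outputs` layout, and the 0.2 default threshold are all assumptions made for this sketch.

```python
from collections import defaultdict
from typing import List, Tuple

# Hypothetical layout: map_outputs[m] = (host, sizes), where sizes[p] is the
# number of bytes map task m wrote for reduce partition p.
MapOutputs = List[Tuple[str, List[int]]]


def blocks_to_fetch(map_outputs: MapOutputs, start: int, end: int) -> List[Tuple[int, int]]:
    """List the (map_id, partition) blocks read by a reduce task covering [start, end)."""
    return [
        (map_id, partition)
        for map_id in range(len(map_outputs))
        for partition in range(start, end)
    ]


def preferred_hosts(map_outputs: MapOutputs, start: int, end: int,
                    fraction: float = 0.2) -> List[str]:
    """Return hosts holding at least `fraction` of the bytes in [start, end).

    The threshold value is an illustrative assumption.
    """
    bytes_by_host = defaultdict(int)
    for host, sizes in map_outputs:
        # Sum over the whole partition range, not a single partition.
        bytes_by_host[host] += sum(sizes[start:end])
    total = sum(bytes_by_host.values())
    if total == 0:
        return []  # nothing to fetch, so no locality preference
    return sorted(h for h, b in bytes_by_host.items() if b / total >= fraction)


if __name__ == "__main__":
    outputs = [
        ("host-a", [10, 0, 90, 5]),
        ("host-b", [10, 40, 10, 5]),
        ("host-c", [80, 0, 0, 5]),
    ]
    # A reduce task that reads map output partitions 1 and 2 together.
    print(blocks_to_fetch(outputs, 1, 3))
    print(preferred_hosts(outputs, 1, 3))
```

The point of the sketch is that locality has to be computed over the whole range of partitions a reduce task reads, which is why logic that assumes one map output partition per reduce task does not carry over.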

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mateiz/spark spark-9852

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8844.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8844
    
----
commit cbf6a5a78c419b8bb26e6259849f1c35a2d31edb
Author: Matei Zaharia <[email protected]>
Date:   2015-08-13T23:35:49Z

    Allow HashShuffleReader to fetch multiple partitions

commit 8f42d5c036b9b0985c89cdcf43f66f9f5eec6f3f
Author: Matei Zaharia <[email protected]>
Date:   2015-08-20T22:13:54Z

    Compute reduce locality only for ShuffledRDD and its SQL counterpart

commit e4a6f5f547788d03c1e0a373fc2f091571c1d12b
Author: Matei Zaharia <[email protected]>
Date:   2015-08-20T22:23:05Z

    More testing

commit 9ab02f17bae711dbd2e3979f8e64863fb84cbd81
Author: Matei Zaharia <[email protected]>
Date:   2015-09-20T16:09:07Z

    Fix compile

commit f4d2519bc3467d595dcd293a2b54157664222630
Author: Matei Zaharia <[email protected]>
Date:   2015-09-20T22:24:34Z

    Some test fixes

----

