GitHub user JoshRosen opened a pull request:

    https://github.com/apache/spark/pull/8089

    [SPARK-9808] Remove spark.shuffle.consolidateFiles and associated implementation.

    I think that we should remove `spark.shuffle.consolidateFiles` and its associated implementation for Spark 1.5.0. Rationale:
    
    - This feature is not properly tested. The sole test for this feature in `HashShuffleManagerSuite` does not appear to be testing the right thing, because it never enables shuffle file consolidation (see the sketch after this list).
    - The motivation for this feature, reducing the number of shuffle files created on disk, has been addressed by sort-based shuffle, which is better tested.
    - Shuffle file consolidation does not work with the external shuffle service.
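
    For context on the first point: consolidation is opt-in and only applies to hash-based shuffle, so a test would have had to set the flag explicitly before the code path being removed could be exercised. The following is a minimal sketch (illustrative names and job, not the actual `HashShuffleManagerSuite` code) of the configuration that enabling the feature would have required:

    ```scala
    import org.apache.spark.{SparkConf, SparkContext}

    object ConsolidationSketch {
      def main(args: Array[String]): Unit = {
        // Sketch only: consolidation is opt-in and only applies to hash-based shuffle.
        val conf = new SparkConf()
          .setMaster("local[2]")
          .setAppName("consolidation-sketch")
          .set("spark.shuffle.manager", "hash")           // shuffle implementation whose consolidation path this PR removes
          .set("spark.shuffle.consolidateFiles", "true")  // the flag this PR removes

        val sc = new SparkContext(conf)
        try {
          // Any shuffle will do; this just forces shuffle output files onto disk.
          sc.parallelize(1 to 1000, 4)
            .map(i => (i % 10, i))
            .reduceByKey(_ + _)
            .count()
        } finally {
          sc.stop()
        }
      }
    }
    ```

    With sort-based shuffle (`spark.shuffle.manager=sort`, the default since Spark 1.2), each map task writes a single data file plus an index file regardless of the number of reduce partitions, which is why the file-count motivation behind consolidation is already addressed.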

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/JoshRosen/spark remove-hash-shuffle-consolidation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/8089.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8089
    
----
commit f4fe48d1f711347d9a19c519c4979c6584274d4b
Author: Josh Rosen <[email protected]>
Date:   2015-08-11T01:25:13Z

    Remove spark.shuffle.consolidateFiles and associated implementation.

----


