GitHub user scwf opened a pull request:

    https://github.com/apache/spark/pull/3694

    [Core] Adding a parallelismRatio to control the partitions num of shuffledRDD

    This adds a parallelismRatio setting that controls the number of
    partitions of a shuffledRDD. The rule is:

         Math.max(1, parallelismRatio * number of partitions of the largest upstream RDD)

    The ratio defaults to 1.0, which keeps the behavior compatible with
    earlier versions. Once we have more experience with it, we can change
    the default.
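
The rule above can be illustrated with a short Python sketch. This is not the code from the patch: the function name and argument names are hypothetical, and truncating the scaled value to an integer is an assumption, since the description does not say how the product is rounded.

```python
def shuffled_partition_count(upstream_partition_counts, parallelism_ratio=1.0):
    """Illustrative only: apply max(1, ratio * largest upstream partition count).

    upstream_partition_counts: partition counts of the upstream RDDs (hypothetical input).
    parallelism_ratio: scaling factor; 1.0 keeps the largest upstream count unchanged.
    """
    largest = max(upstream_partition_counts)
    # Assumption: the product is truncated toward zero before the floor of 1 is applied.
    return max(1, int(parallelism_ratio * largest))


if __name__ == "__main__":
    print(shuffled_partition_count([8, 20]))        # default ratio of 1.0
    print(shuffled_partition_count([8, 20], 0.5))   # half of the largest upstream count
    print(shuffled_partition_count([3], 0.1))       # floor of 1 partition applies
```

With the default ratio the result equals the largest upstream partition count, which is the compatibility property the description relies on. The floor of 1 keeps a small ratio from producing zero partitions.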

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/scwf/spark parallismRatio

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3694.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3694
    
----
commit e6c43ecdf4e49ea73befea8b87fb2a47eec2fb37
Author: wangfei <[email protected]>
Date:   2014-12-14T09:25:17Z

    adding spark.default.parallelismRatio

commit 63826ae63bb1f912a6000f0cd958c44579960c1e
Author: wangfei <[email protected]>
Date:   2014-12-14T09:31:58Z

    minor fix

commit a71ce3b92a3f49f8035fa14b4249775087203af5
Author: wangfei <[email protected]>
Date:   2014-12-15T01:01:13Z

    minor fix

commit f21bfd4904fa340099d190bd3963fefc79f0faa4
Author: wangfei <[email protected]>
Date:   2014-12-15T01:11:15Z

    minor fix

----


