Repository: spark
Updated Branches:
  refs/heads/master c5db8e2c0 -> c6f4e7042
SPARK-4230. Doc for spark.default.parallelism is incorrect

Author: Sandy Ryza <[email protected]>

Closes #3107 from sryza/sandy-spark-4230 and squashes the following commits:

37a1d19 [Sandy Ryza] Clear up a couple things
34d53de [Sandy Ryza] SPARK-4230. Doc for spark.default.parallelism is incorrect

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c6f4e704
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c6f4e704
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c6f4e704

Branch: refs/heads/master
Commit: c6f4e704214097f17d2d6abfbfef4bb208e4339f
Parents: c5db8e2
Author: Sandy Ryza <[email protected]>
Authored: Mon Nov 10 12:40:41 2014 -0800
Committer: Patrick Wendell <[email protected]>
Committed: Mon Nov 10 12:40:41 2014 -0800

----------------------------------------------------------------------
 docs/configuration.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/c6f4e704/docs/configuration.md
----------------------------------------------------------------------
diff --git a/docs/configuration.md b/docs/configuration.md
index 0f9eb81..f0b396e 100644
--- a/docs/configuration.md
+++ b/docs/configuration.md
@@ -562,6 +562,9 @@ Apart from these, the following properties are also available, and may be useful
 <tr>
 <td><code>spark.default.parallelism</code></td>
 <td>
+ For distributed shuffle operations like <code>reduceByKey</code> and <code>join</code>, the
+ largest number of partitions in a parent RDD. For operations like <code>parallelize</code>
+ with no parent RDDs, it depends on the cluster manager:
 <ul>
 <li>Local mode: number of cores on the local machine</li>
 <li>Mesos fine grained mode: 8</li>
@@ -569,8 +572,8 @@ Apart from these, the following properties are also available, and may be useful
 </ul>
 </td>
 <td>
- Default number of tasks to use across the cluster for distributed shuffle operations
- (<code>groupByKey</code>, <code>reduceByKey</code>, etc) when not set by user.
+ Default number of partitions in RDDs returned by transformations like <code>join</code>,
+ <code>reduceByKey</code>, and <code>parallelize</code> when not set by user.
 </td>
 </tr>
 <tr>
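
The default described by the new doc text can be read as a small decision rule. The sketch below is a plain-Python model of that wording only, not Spark code: `default_num_partitions` and its parameters are hypothetical names introduced for illustration, and the model ignores anything else Spark may take into account when choosing a partition count. The cluster-manager value is passed in by the caller because the hunk shows only some of the list entries.

```python
from typing import Optional, Sequence


def default_num_partitions(
    parent_partition_counts: Sequence[int],
    configured_parallelism: Optional[int] = None,
    cluster_manager_default: Optional[int] = None,
) -> int:
    """Model of the documented default (hypothetical helper, not a Spark API)."""
    # Assumption: an explicit spark.default.parallelism value replaces the default.
    if configured_parallelism is not None:
        return configured_parallelism
    # Shuffle operations such as reduceByKey and join:
    # the largest number of partitions in a parent RDD.
    if parent_partition_counts:
        return max(parent_partition_counts)
    # Operations such as parallelize have no parent RDDs,
    # so the value depends on the cluster manager.
    if cluster_manager_default is None:
        raise ValueError("no parent RDDs: a cluster-manager default is required")
    return cluster_manager_default


# A join over parents with 4, 16 and 8 partitions.
print(default_num_partitions([4, 16, 8]))
# parallelize under Mesos fine grained mode (8, per the doc text).
print(default_num_partitions([], cluster_manager_default=8))
```

Under this reading, the first call follows the shuffle branch and the second falls through to the cluster-manager branch.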
