[
https://issues.apache.org/jira/browse/SPARK-823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985880#comment-13985880
]
Diana Carroll commented on SPARK-823:
-------------------------------------
Yes, please clarify the documentation; I just ran into this. The Configuration
guide (http://spark.apache.org/docs/latest/configuration.html) says the default
is 8.
In testing this on Standalone Spark, there actually is no default value for the
variable:
>sc.getConf.contains("spark.default.parallelism")
>res1: Boolean = false
If the variable is not set, the default behavior appears to be decided in
code, e.g. in Partitioner.scala:
{code}
if (rdd.context.conf.contains("spark.default.parallelism")) {
  new HashPartitioner(rdd.context.defaultParallelism)
} else {
  new HashPartitioner(bySize.head.partitions.size)
}
{code}
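To make the fallback easier to reason about, here is a minimal Spark-free sketch of the same decision. The names DefaultPartitionCount, choose, contextDefaultParallelism and upstreamPartitionCounts are hypothetical stand-ins and not part of Spark. It also assumes that bySize.head in the snippet above is the input RDD with the most partitions, which the snippet alone does not confirm.

```scala
// Hypothetical, Spark-free model of the fallback quoted above.
// None of these names exist in Spark; they only mirror the quoted branch.
object DefaultPartitionCount {
  val Key = "spark.default.parallelism"

  // conf: stand-in for the settings held by SparkConf.
  // contextDefaultParallelism: stand-in for rdd.context.defaultParallelism.
  // upstreamPartitionCounts: partition counts of the input RDDs (non-empty).
  def choose(
      conf: Map[String, String],
      contextDefaultParallelism: Int,
      upstreamPartitionCounts: Seq[Int]): Int = {
    require(upstreamPartitionCounts.nonEmpty, "need at least one input RDD")
    if (conf.contains(Key)) {
      // Key explicitly set: use the context-wide default parallelism.
      contextDefaultParallelism
    } else {
      // Key not set: assumption that bySize.head is the RDD with the most
      // partitions, so take the largest count.
      upstreamPartitionCounts.max
    }
  }
}
```

With an empty configuration, the result follows the input RDDs rather than any documented default, which is consistent with the behavior described above.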
> spark.default.parallelism's default is inconsistent across scheduler backends
> -----------------------------------------------------------------------------
>
> Key: SPARK-823
> URL: https://issues.apache.org/jira/browse/SPARK-823
> Project: Spark
> Issue Type: Bug
> Components: Documentation, Spark Core
> Affects Versions: 0.8.0, 0.7.3
> Reporter: Josh Rosen
> Priority: Minor
>
> The [0.7.3 configuration
> guide|http://spark-project.org/docs/latest/configuration.html] says that
> {{spark.default.parallelism}}'s default is 8, but the default is actually
> max(totalCoreCount, 2) for the standalone scheduler backend, 8 for the Mesos
> scheduler, and {{threads}} for the local scheduler:
> https://github.com/mesos/spark/blob/v0.7.3/core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala#L157
> https://github.com/mesos/spark/blob/v0.7.3/core/src/main/scala/spark/scheduler/mesos/MesosSchedulerBackend.scala#L317
> https://github.com/mesos/spark/blob/v0.7.3/core/src/main/scala/spark/scheduler/local/LocalScheduler.scala#L150
> Should this be clarified in the documentation? Should the Mesos scheduler
> backend's default be revised?