Could you give an example where default parallelism is set to 2 where it
didn't use to be?

Here is the relevant section for Spark standalone mode:
CoarseGrainedSchedulerBackend.scala#L211<https://github.com/apache/incubator-spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L211>.
If spark.default.parallelism is set, it overrides everything else. If it
is not set, we use the larger of the total number of cores in the cluster
and 2, which is the same logic that has been used since
spark-0.7<https://github.com/apache/incubator-spark/blob/branch-0.7/core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala#L156>.
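
In other words, the behavior amounts to roughly the following sketch (approximate, not the actual source; the totalCoreCount parameter and object name here are made up for illustration):

    // Sketch of the standalone backend's defaultParallelism logic (approximate;
    // see the CoarseGrainedSchedulerBackend link above for the real code).
    object DefaultParallelismSketch {
      def defaultParallelism(totalCoreCount: Int, conf: Map[String, String]): Int =
        conf.get("spark.default.parallelism") match {
          case Some(v) => v.toInt                      // explicit setting always wins
          case None    => math.max(totalCoreCount, 2)  // otherwise total cores, floor of 2
        }

      def main(args: Array[String]): Unit = {
        println(defaultParallelism(8, Map.empty))                                // 8
        println(defaultParallelism(1, Map.empty))                                // 2
        println(defaultParallelism(8, Map("spark.default.parallelism" -> "2")))  // 2
      }
    }

So with nothing set, you'd only see 2 if the backend thought the cluster had 2 or fewer cores.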

The simplest possibility is that you're setting spark.default.parallelism
somewhere; otherwise a bug may have been introduced that breaks the
defaulting.
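
If you want to rule that out quickly, something like the following should show both what was set and what the scheduler resolved (a sketch against the 0.9 API; set the master however you normally launch, or just paste the two println lines into spark-shell, which already provides sc):

    import org.apache.spark.{SparkConf, SparkContext}

    // One-off check run as a driver program.
    object ParallelismCheck {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("parallelism-check")
        val sc = new SparkContext(conf)
        println(sc.getConf.getOption("spark.default.parallelism"))  // None unless something set it
        println(sc.defaultParallelism)                              // value the scheduler computed
        sc.stop()
      }
    }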


On Sat, Feb 1, 2014 at 12:30 AM, Koert Kuipers <ko...@tresata.com> wrote:

> i just managed to upgrade my 0.9-SNAPSHOT from the last scala 2.9.x
> version to the latest.
>
>
> everything seems good except that my default parallelism is now set to 2
> for jobs instead of some smart number based on the number of cores (i think
> that is what it used to do). is this change on purpose?
>
> i am running spark standalone.
>
> thx, koert
>
