You can check org.apache.spark.sql.internal.SQLConf for other default settings as well:

  val SHUFFLE_PARTITIONS = SQLConfigBuilder("spark.sql.shuffle.partitions")
    .doc("The default number of partitions to use when shuffling data for joins or aggregations.")
    .intConf
    .createWithDefault(200)
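The point is that spark.default.parallelism only governs RDD operations; Spark SQL shuffles (DataFrame joins and aggregations) read spark.sql.shuffle.partitions instead, which is why your setting had no effect. A minimal sketch of overriding it, assuming Spark 2.x (the app name and the value 20 are illustrative; on 1.x you would call sqlContext.setConf instead):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("shuffle-partitions-demo")           // hypothetical app name
    .config("spark.sql.shuffle.partitions", "20") // override the 200 default
    .getOrCreate()

  // It can also be changed at runtime, per session:
  spark.conf.set("spark.sql.shuffle.partitions", "20")

With that set, subsequent joins and aggregations produce 20 shuffle partitions instead of 200, so the job writes at most 20 output files and the empty part-files go away.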
> On 20 May 2016, at 13:17, 251922...@qq.com wrote:
>
> Hi all.
> I set spark.default.parallelism to 20 in spark-defaults.conf and sent
> this file to all nodes.
> But I found the reduce number is still the default value, 200.
> Does anyone else encounter this problem? Can anyone give some advice?
>
> ############
> [Stage 9:>   (0 + 0) / 200]
> [Stage 9:>   (0 + 2) / 200]
> [Stage 9:>   (1 + 2) / 200]
> [Stage 9:>   (2 + 2) / 200]
> #######
>
> And this results in many empty files. Because my data is small, only some of
> the 200 files contain data.
> #######
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00000
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00001
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00002
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00003
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00004
> 2016-05-20 17:01 /warehouse/dmpv3.db/datafile/tmp/output/userprofile/20160519/part-00005
> ########