Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread Wanda Hawk
Is this what you are looking for? In Shark, the default reducer number is 1 and is controlled by the property mapred.reduce.tasks. Spark SQL deprecates this property in favor of spark.sql.shuffle.partitions, whose default value is 200. Users may customize this property via SET: …
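For reference, a minimal sketch of how that SET statement can be issued from Scala (assuming a Spark 1.x SQLContext named sqlContext; the partition count of 64 is illustrative, not from the thread):

    // Override the default of 200 shuffle partitions used by
    // Spark SQL joins and aggregations
    sqlContext.sql("SET spark.sql.shuffle.partitions=64")

    // Equivalent, via the configuration API
    sqlContext.setConf("spark.sql.shuffle.partitions", "64")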

Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread shahab
Thanks for the useful comment. But I guess this setting applies only when I use Spark SQL, right? Is there any similar setting for Spark? best, /Shahab On Tue, Oct 28, 2014 at 2:38 PM, Wanda Hawk wanda_haw...@yahoo.com wrote: Is this what you are looking for? In Shark, the default reducer …

Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread Ilya Ganelin
In Spark, certain functions take an optional parameter that determines the number of partitions (distinct, textFile, etc.). You can also use the coalesce() or repartition() functions to change the number of partitions of your RDD. Thanks. On Oct 28, 2014 9:58 AM, shahab shahab.mok...@gmail.com …
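A minimal Scala sketch of the options mentioned above (the app name, input path, and partition counts are illustrative, not from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("partitions-example"))

    // Several RDD operations accept an optional numPartitions argument
    val lines  = sc.textFile("hdfs:///data/input.txt", 8)  // ask for at least 8 partitions
    val unique = lines.distinct(16)                        // 16 partitions for the shuffle

    // Change the partition count of an existing RDD
    val widened  = unique.repartition(32)  // full shuffle; can increase or decrease
    val narrowed = widened.coalesce(4)     // avoids a shuffle when only decreasing

    println(narrowed.partitions.length)    // => 4

Note that repartition() always shuffles, while coalesce() can reduce the partition count without a shuffle by merging existing partitions, which is usually cheaper when you only need fewer partitions.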