Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread Wanda Hawk
Is this what you are looking for? In Shark, the default reducer number is 1 and is controlled by the property mapred.reduce.tasks. Spark SQL deprecates this property in favor of spark.sql.shuffle.partitions, whose default value is 200. Users may customize this property via SET: …
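For reference, a minimal sketch of how that SET statement can be issued from Scala (assuming a Spark 1.x SQLContext named sqlContext; the partition count of 64 is illustrative, not from the thread):

    // Override the default of 200 shuffle partitions used by
    // Spark SQL joins and aggregations
    sqlContext.sql("SET spark.sql.shuffle.partitions=64")

    // Equivalent, via the configuration API
    sqlContext.setConf("spark.sql.shuffle.partitions", "64")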

Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread shahab
Thanks for the useful comment. But I guess this setting applies only when I use Spark SQL, right? Is there any similar setting for Spark? best, /Shahab On Tue, Oct 28, 2014 at 2:38 PM, Wanda Hawk wanda_haw...@yahoo.com wrote: Is this what you are looking for? In Shark, the default reducer …

Re: How can number of partitions be set in spark-env.sh?

2014-10-28 Thread Ilya Ganelin
In Spark, certain functions take an optional parameter that determines the number of partitions (distinct, textFile, etc.). You can also use the coalesce() or repartition() functions to change the number of partitions of your RDD. Thanks. On Oct 28, 2014 9:58 AM, shahab shahab.mok...@gmail.com …
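A minimal Scala sketch of the options mentioned above (the app name, input path, and partition counts are illustrative, not from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("partitions-example"))

    // Several RDD operations accept an optional numPartitions argument
    val lines  = sc.textFile("hdfs:///data/input.txt", 8)  // ask for at least 8 partitions
    val unique = lines.distinct(16)                        // 16 partitions for the shuffle

    // Change the partition count of an existing RDD
    val widened  = unique.repartition(32)  // full shuffle; can increase or decrease
    val narrowed = widened.coalesce(4)     // avoids a shuffle when only decreasing

    println(narrowed.partitions.length)    // => 4

Note that repartition() always shuffles, while coalesce() can reduce the partition count without a shuffle by merging existing partitions, which is usually cheaper when you only need fewer partitions.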