Is this what you are looking for?
In Shark, the default number of reducers is 1 and is controlled by the property
mapred.reduce.tasks. Spark SQL deprecates this property in favor of
spark.sql.shuffle.partitions, whose default value is 200. Users may customize
this property via SET:

SET spark.sql.shuffle.partitions=<desired number>;
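The same setting can also be changed programmatically; a minimal sketch against the Spark 1.x API, run in local mode (the app name, master, and value of 10 are illustrative assumptions, not from the thread):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Illustrative local context; in a real job the master/app name would differ.
val sc = new SparkContext(
  new SparkConf().setAppName("shuffle-demo").setMaster("local[2]"))
val sqlContext = new SQLContext(sc)

// Equivalent to running "SET spark.sql.shuffle.partitions=10;" in SQL:
// lowers the number of post-shuffle partitions from the default of 200.
sqlContext.setConf("spark.sql.shuffle.partitions", "10")
```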
Thanks for the useful comment. But I guess this setting applies only when I
use Spark SQL, right? Is there any similar setting for Spark?
best,
/Shahab
On Tue, Oct 28, 2014 at 2:38 PM, Wanda Hawk wanda_haw...@yahoo.com wrote:
In Spark, certain functions have an optional parameter to determine the
number of partitions (distinct, textFile, etc.). You can also use the
coalesce() or repartition() functions to change the number of partitions
of your RDD. Thanks.
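A sketch of the API described above (the file name and partition counts are illustrative assumptions; requires a Spark 1.x installation):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Illustrative local context.
val sc = new SparkContext(
  new SparkConf().setAppName("partitions-demo").setMaster("local[2]"))

// Many RDD operations take an optional partition count; here textFile is
// asked for a minimum of 8 partitions when loading the (hypothetical) file.
val rdd = sc.textFile("data.txt", 8)

// repartition() always performs a full shuffle to reach the target count.
val wider = rdd.repartition(16)

// coalesce() merges existing partitions and by default avoids a shuffle,
// so it is the cheaper choice when only *reducing* the partition count.
val narrower = rdd.coalesce(4)
```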
On Oct 28, 2014 9:58 AM, shahab shahab.mok...@gmail.com wrote: