Pardon me - I should be looking at JavaPairRDD, but my point still stands that there's no integer parameter for sortByKey(), unlike its Scala counterpart: http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.api.java.JavaPairRDD

________________________________
From: Ashish Rangole [[email protected]]
Sent: Monday, December 09, 2013 7:41 PM
To: [email protected]
Subject: Re: JavaRDD, Specify number of tasks
AFAIK yes. IIRC, there is a second parameter, numPartitions, that one can provide to these operations.

On Dec 9, 2013 8:19 PM, "Matt Cheah" <[email protected]> wrote:

Hi,

When I use a JavaPairRDD's groupByKey(), reduceByKey(), or sortByKey(), is there a way for me to specify the number of reduce tasks, as there is in a Scala RDD? Or do I have to set them all to use spark.default.parallelism?

Thanks,

-Matt Cheah

(feels like I've been asking a lot of questions as of late...)
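
For reference, here is a minimal sketch of the numPartitions overloads discussed above, assuming Spark 1.x or later, a local master, and a partition count of 8 chosen purely for illustration (the class name and app name are made up):

import java.util.Arrays;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;

import scala.Tuple2;

public class NumPartitionsSketch {
    public static void main(String[] args) {
        // Hypothetical local context; any master / app name would do.
        JavaSparkContext sc = new JavaSparkContext("local", "num-partitions-sketch");

        JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(Arrays.asList(
                new Tuple2<String, Integer>("a", 1),
                new Tuple2<String, Integer>("b", 2),
                new Tuple2<String, Integer>("a", 3)));

        // groupByKey and reduceByKey take the number of partitions
        // (and hence reduce tasks) as an extra argument.
        JavaPairRDD<String, Iterable<Integer>> grouped = pairs.groupByKey(8);

        JavaPairRDD<String, Integer> reduced = pairs.reduceByKey(
                new Function2<Integer, Integer, Integer>() {
                    @Override
                    public Integer call(Integer a, Integer b) {
                        return a + b;
                    }
                }, 8);

        // Later Spark releases also added a numPartitions overload to the
        // Java sortByKey; per the thread above it was not yet in the Java
        // API at the time of writing. Note the range partitioner may end
        // up with fewer partitions than requested when there are only a
        // few distinct keys.
        JavaPairRDD<String, Integer> sorted = pairs.sortByKey(true, 8);

        System.out.println(grouped.partitions().size());  // 8
        System.out.println(reduced.partitions().size());  // 8
        System.out.println(sorted.partitions().size());

        sc.stop();
    }
}

If numPartitions is omitted, the default partitioner is used, which (if I recall correctly) falls back to spark.default.parallelism when that property is set.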
