It seems you are already using partitionBy, so you can simply plug in your custom function in place of the lambda x: x and it should use that to partition. RangePartitioner is available in Scala; I am not sure if it is exposed directly in Python.
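Something along these lines should work (an untested sketch; the integer keys, key ranges, and partition count are only for illustration):

    from pyspark import SparkContext

    sc = SparkContext("local", "custom-partitioning")

    # Example (key, value) pairs with integer keys 0..99.
    pairs = sc.parallelize([(k, k * k) for k in range(100)])

    def range_partition(key):
        # Send keys 0-24 to partition 0, 25-49 to 1, 50-74 to 2, 75-99 to 3.
        return key // 25

    # Pass the custom function in place of the default hash partitioner.
    partitioned = pairs.partitionBy(4, range_partition)

    # Check how the records were distributed across partitions.
    print(partitioned.glom().map(len).collect())  # expected: [25, 25, 25, 25]

    sc.stop()

partitionBy takes the value returned by the partition function modulo numPartitions, so the function only needs to map each key to an integer that groups keys the way you want.

Regards
Mayur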
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi


On Tue, Feb 25, 2014 at 11:00 AM, Mayur Rustagi <mayur.rust...@gmail.com> wrote:

> Okay, you caught me on this.. I haven't used the Python API.
> Let's try
> http://www.cs.berkeley.edu/~pwendell/strataconf/api/pyspark/pyspark.rdd.RDD-class.html#partitionBy
> on the rdd and customize the partitioner from hash to a custom function.
> Please update the list if it works; it seems to be a common problem.
>
> Mayur Rustagi
> Ph: +919632149971
> http://www.sigmoidanalytics.com
> https://twitter.com/mayur_rustagi
>
>
> On Mon, Feb 24, 2014 at 9:23 PM, zhaoxw12 <zhaox...@mails.tsinghua.edu.cn> wrote:
>
>> Thanks for your reply.
>> For some reasons, I have to use Python in my program. I can't find the API
>> of RangePartitioner. Could you tell me more details?