Okay, you caught me on this. I haven't used the Python API. Let's try partitionBy (http://www.cs.berkeley.edu/~pwendell/strataconf/api/pyspark/pyspark.rdd.RDD-class.html#partitionBy) on the RDD and customize the partitioner, replacing the default hash with a custom function. Please update the list if it works; it seems to be a common problem.
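Roughly what I have in mind, as an untested sketch: partitionBy takes a number of partitions and a function mapping each key to an integer, so a range-style partitioner can be approximated by bisecting a sorted list of boundaries. The names make_range_partitioner and repartition_by_range, and the boundary values, are made up for illustration; the RDD is assumed to hold (key, value) pairs with comparable keys.

```python
from bisect import bisect_right


def make_range_partitioner(boundaries):
    """Build a key -> partition index function from split points.

    Hypothetical helper: keys below the first boundary go to partition 0,
    keys at or above the last boundary go to partition len(boundaries).
    """
    bounds = sorted(boundaries)

    def partition_func(key):
        # Count how many boundaries are <= key; that count is the partition.
        return bisect_right(bounds, key)

    return partition_func


def repartition_by_range(pair_rdd, boundaries):
    """Apply the custom partitioner to an RDD of (key, value) pairs.

    Assumes pair_rdd is a PySpark RDD; partitionBy(numPartitions,
    partitionFunc) is the call from the API page linked above.
    """
    func = make_range_partitioner(boundaries)
    return pair_rdd.partitionBy(len(boundaries) + 1, func)
```

To see whether the result is well distributed, something like repartition_by_range(rdd, [10, 20, 30]).glom().map(len).collect() should list the size of each partition. Picking the boundaries from a sample of the keys would probably balance things better than fixed values.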
Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi

On Mon, Feb 24, 2014 at 9:23 PM, zhaoxw12 <zhaox...@mails.tsinghua.edu.cn> wrote:

> Thanks for your reply.
> For some reasons, I have to use python in my program. I can't find the API
> of RangePartitioner. Could you tell me more details?
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-get-well-distribute-partition-tp2002p2013.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.