Sean, no, I don't want to sort the whole RDD; sortByKey seems to be good enough for that.
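To make the distinction concrete, here is a minimal sketch of what sorting within a single partition could look like. It is plain Java with no Spark dependency, and it is not necessarily the code referred to in this thread. The class name PartitionSortSketch and the helper sortPartition are hypothetical, introduced only for illustration. The assumption is that a function of this shape would be invoked once per partition, for example from inside mapPartitions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

final class PartitionSortSketch {

    // Hypothetical helper: sorts the elements of ONE partition in memory.
    // The whole partition is copied into a list before sorting, so memory
    // use grows with the size of the largest partition.
    static <T> Iterator<T> sortPartition(Iterator<T> partition,
                                         Comparator<? super T> order) {
        List<T> buffer = new ArrayList<>();
        while (partition.hasNext()) {
            buffer.add(partition.next());
        }
        buffer.sort(order);
        return buffer.iterator();
    }

    public static void main(String[] args) {
        // Stand-in for the iterator Spark would hand to a per-partition function.
        Iterator<Integer> partition = Arrays.asList(3, 1, 2).iterator();
        Iterator<Integer> sorted =
                sortPartition(partition, Comparator.<Integer>naturalOrder());
        while (sorted.hasNext()) {
            System.out.println(sorted.next());
        }
    }
}
```

Because the helper buffers every element of the partition before sorting, a single very large partition is enough to exhaust the heap; an external (spill-to-disk) sort would be needed to avoid that.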
Right now, I think the code I have will work for me, but I can imagine conditions where it will run out of memory. I'm not completely sure whether SPARK-983 <https://issues.apache.org/jira/browse/SPARK-983>, which Andrew mentioned, covers the rdd.sortPartitions() use case. Can someone comment on the scope of SPARK-983?

Thanks!

-----
--
Madhu
https://www.linkedin.com/in/msiddalingaiah
--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Sorting-partitions-in-Java-tp6715p6725.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.