repartitionAndSortWithinPartitions sorts by key only, not by the values within each key, so on its own it is not really a secondary sort.
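
One common workaround is to move the value into the key, so the shuffle sorts on the (key, value) pair while a custom partitioner still routes records by the original key alone. A minimal sketch of that idea, assuming Spark is on the classpath and run in local mode; PrimaryKeyPartitioner, SecondarySortSketch and the sample data are made-up names for illustration, not part of Spark:

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical partitioner: looks only at the first element of the
// composite (key, value) key, so all records of one key share a partition.
class PrimaryKeyPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    case (primary, _) =>
      val mod = primary.hashCode % numPartitions
      if (mod < 0) mod + numPartitions else mod // keep the index non-negative
  }
}

object SecondarySortSketch {
  // Returns the contents of each partition after the shuffle.
  def run(): Array[Seq[(String, Int)]] = {
    val conf = new SparkConf().setAppName("secondary-sort-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val data = sc.parallelize(Seq(("a", 3), ("b", 1), ("a", 1), ("b", 2), ("a", 2)))

      // Promote the value into the key; the implicit tuple Ordering then
      // sorts by key first and by value second within each partition.
      val composite = data.map { case (k, v) => ((k, v), ()) }

      val sorted = composite
        .repartitionAndSortWithinPartitions(new PrimaryKeyPartitioner(2))
        .map { case ((k, v), _) => (k, v) }

      // glom() turns each partition into an array so the per-partition
      // order can be inspected on the driver.
      sorted.glom().collect().map(_.toSeq)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit =
    run().foreach(partition => println(partition.mkString(", ")))
}
```

Each partition comes back ordered by key and then by value, but there is no ordering across partitions, because the partitioner here is hash-based rather than range-based.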
for secondary sort also check out:
https://github.com/tresata/spark-sorted

On Thu, Jul 14, 2016 at 1:09 PM, Punit Naik <naik.puni...@gmail.com> wrote:

> Hi guys
>
> In my spark/scala code I am implementing secondary sort. I wanted to know,
> when I call the "repartitionAndSortWithinPartitions" method, the whole
> (entire) RDD will be sorted or only the individual partitions will be
> sorted?
> If its the latter case, will applying a "sortByKey" after
> "repartitionAndSortWithinPartitions" be faster now that the individual
> partitions are sorted?
>
> --
> Thank You
>
> Regards
>
> Punit Naik
>