Re: repartitionAndSortWithinPartitions HELP

Koert Kuipers Thu, 14 Jul 2016 10:56:43 -0700

repartitionAndSortWithinPartitions sorts by key, not by the values per key,
so it is not really a secondary sort by itself.
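
a rough sketch of the usual workaround, in case it helps: move the value into
the key, partition on the original key only, and let the shuffle sort the
composite key. FirstElementPartitioner, SecondarySortSketch and the sample
data are made-up names for illustration, and this assumes a Spark version
where the pair/ordered RDD implicits are picked up without importing
SparkContext._ (1.3 or later, as far as i know).

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical partitioner: looks only at the first element of a composite
// (key, value) key, so all records for one logical key share a partition.
// It assumes every key is a Tuple2 and fails with a MatchError otherwise.
class FirstElementPartitioner(override val numPartitions: Int) extends Partitioner {
  override def getPartition(key: Any): Int = key match {
    case (k, _) =>
      val m = k.hashCode % numPartitions
      if (m < 0) m + numPartitions else m // keep the index non-negative
  }
}

object SecondarySortSketch {
  // Returns the content of each partition after the shuffle, one array per partition.
  def sortedPartitions(sc: SparkContext): Array[Array[(String, Int)]] = {
    // Made-up sample data.
    val records = sc.parallelize(Seq(("b", 3), ("a", 2), ("b", 1), ("a", 5), ("a", 1)))
    records
      .map { case (k, v) => ((k, v), ()) } // value becomes part of the key
      .repartitionAndSortWithinPartitions(new FirstElementPartitioner(2))
      .keys   // back to (key, value) pairs, now sorted within each partition
      .glom() // one array per partition, to make the partition boundaries visible
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("secondary-sort-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try sortedPartitions(sc).foreach(p => println(p.mkString(", ")))
    finally sc.stop()
  }
}
```

each partition comes back sorted by (key, value) using the default tuple
ordering, so the values for any one key are contiguous and ascending, but
there is no ordering across partitions.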


for secondary sort also check out:
https://github.com/tresata/spark-sorted


On Thu, Jul 14, 2016 at 1:09 PM, Punit Naik <naik.puni...@gmail.com> wrote:

> Hi guys
>
> In my Spark/Scala code I am implementing a secondary sort. When I call the
> "repartitionAndSortWithinPartitions" method, will the entire RDD be sorted,
> or only the individual partitions?
> If it's the latter, will applying "sortByKey" after
> "repartitionAndSortWithinPartitions" be faster now that the individual
> partitions are sorted?
>
> --
> Thank You
>
> Regards
>
> Punit Naik
>
