Hi,

I have code that does the following using RDDs:

val outputPartitionCount = 300
val part = new MyOwnPartitioner(outputPartitionCount)
val finalRdd = myRdd.repartitionAndSortWithinPartitions(part)

where myRdd is correctly formed as (key, value) pairs. I am looking to convert
this to use Dataset/DataFrame instead of RDDs, so my question is:

Does Spark support custom partitioning of a Dataset/DataFrame?
Can I accomplish the per-partition sort using mapPartitions on the resulting
partitioned Dataset/DataFrame?
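
To make the question concrete, the closest equivalent I can see in the
Dataset API is sketched below. This is only a sketch: KV and myDs are
hypothetical stand-ins for my data, and repartition(n, col) uses Spark's
built-in hash partitioning rather than MyOwnPartitioner, which is exactly
the gap I am asking about.

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.col

case class KV(key: Int, value: String)

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val outputPartitionCount = 300
val myDs: Dataset[KV] = ???   // same key/value data as myRdd, loaded as a Dataset

// Hash-partition on the key column (not MyOwnPartitioner), then sort each
// partition locally; sortWithinPartitions does not trigger another shuffle.
val partitioned = myDs
  .repartition(outputPartitionCount, col("key"))
  .sortWithinPartitions(col("key"))

// Or do the per-partition sort inside mapPartitions, assuming each
// partition fits in memory.
val sorted = partitioned.mapPartitions { iter =>
  iter.toSeq.sortBy(_.key).iterator
}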

Any thoughts?

Regards,
Keith.

http://keith-chapman.com
