Hi,

I have code that does the following using RDDs:

val outputPartitionCount = 300
val part = new MyOwnPartitioner(outputPartitionCount)
val finalRdd = myRdd.repartitionAndSortWithinPartitions(part)

where myRdd is correctly formed as (key, value) pairs. I am looking to convert
this to use Dataset/DataFrame instead of RDDs, so my question is:

Does Spark support custom partitioning of a Dataset/DataFrame?
Can I accomplish the per-partition sort using mapPartitions on the resulting
partitioned Dataset/DataFrame?
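
To make the question concrete, the closest equivalent I can see in the
Dataset API is sketched below. This is only a sketch: KV and myDs are
hypothetical stand-ins for my data, and repartition(n, col) uses Spark's
built-in hash partitioning rather than MyOwnPartitioner, which is exactly
the gap I am asking about.

import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.functions.col

case class KV(key: Int, value: String)

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

val outputPartitionCount = 300
val myDs: Dataset[KV] = ???   // same key/value data as myRdd, loaded as a Dataset

// Hash-partition on the key column (not MyOwnPartitioner), then sort each
// partition locally; sortWithinPartitions does not trigger another shuffle.
val partitioned = myDs
  .repartition(outputPartitionCount, col("key"))
  .sortWithinPartitions(col("key"))

// Or do the per-partition sort inside mapPartitions, assuming each
// partition fits in memory.
val sorted = partitioned.mapPartitions { iter =>
  iter.toSeq.sortBy(_.key).iterator
}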

Any thoughts?

Regards,
Keith.

http://keith-chapman.com
