Hi, I have code that does the following using RDDs:

val outputPartitionCount = 300
val part = new MyOwnPartitioner(outputPartitionCount)
val finalRdd = myRdd.repartitionAndSortWithinPartitions(part)

where myRdd is correctly formed as key/value pairs. I am looking to convert this to use Dataset/DataFrame instead of RDDs, so my questions are:

Is custom partitioning of Dataset/DataFrame implemented in Spark?
Can I accomplish the partial sort using mapPartitions on the resulting partitioned Dataset/DataFrame?

Any thoughts?

Regards,
Keith.

http://keith-chapman.com
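The closest I have found in the Dataset API is repartitioning by a key expression plus a per-partition sort, but that only gives Spark's built-in hash partitioning, not my custom partitioner. A sketch of what I mean (assuming myRdd is an RDD of key/value tuples with encoders available; names here are just for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("example").getOrCreate()
import spark.implicits._

// Convert the existing key/value RDD into a Dataset of tuples,
// whose columns are named _1 (key) and _2 (value).
val myDs = myRdd.toDS()

val outputPartitionCount = 300
val finalDs = myDs
  .repartition(outputPartitionCount, col("_1")) // hash partitioning on the key
  .sortWithinPartitions(col("_1"))              // partition-local (partial) sort
```

This reproduces the shape of repartitionAndSortWithinPartitions, but I see no way to plug in MyOwnPartitioner's placement logic.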