I haven't worked with Datasets, but would this help?
https://stackoverflow.com/questions/37513667/how-to-create-a-spark-dataset-from-an-rdd
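
For the question quoted below, here is a rough, untested sketch of two options, not a definitive answer. As far as I know, the Dataset/DataFrame API does not take a custom Partitioner, so option 1 uses the built-in hash partitioning on a column followed by sortWithinPartitions, and option 2 drops to the RDD, applies the partitioner there, and converts back to a Dataset (which is what the link above covers). The sample data, the column names "key" and "value", and the use of HashPartitioner as a stand-in for MyOwnPartitioner are all assumptions made for illustration.

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object PartitionSortSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("partition-sort-sketch")
      .master("local[*]") // assumption: local run, for illustration only
      .getOrCreate()
    import spark.implicits._

    val outputPartitionCount = 300

    // Hypothetical key/value pairs standing in for myRdd's contents.
    val ds = Seq(("b", 2), ("a", 1), ("c", 3), ("a", 0)).toDS()

    // Option 1: built-in hash partitioning on the key column, then a
    // per-partition sort. This is NOT a custom partitioner: Spark decides
    // which partition each key lands in.
    val viaDataFrame = ds.toDF("key", "value")
      .repartition(outputPartitionCount, col("key"))
      .sortWithinPartitions(col("key"))

    // Option 2: keep the custom partitioner by going through the RDD API
    // and converting the result back to a Dataset.
    // HashPartitioner is a placeholder for MyOwnPartitioner here.
    val part = new HashPartitioner(outputPartitionCount)
    val viaRdd = ds.rdd
      .repartitionAndSortWithinPartitions(part)
      .toDS()

    viaDataFrame.show()
    viaRdd.show()

    spark.stop()
  }
}
```

Option 2 keeps exact control over which partition each key goes to, at the cost of leaving the Dataset API for that step. Option 1 stays in the Dataset API but only works if hash partitioning on the key is acceptable.
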
On Jun 23, 2017 5:43 PM, "Keith Chapman" <keithgchap...@gmail.com> wrote:

> Hi,
>
> I have code that does the following using RDDs,
>
> val outputPartitionCount = 300
> val part = new MyOwnPartitioner(outputPartitionCount)
> val finalRdd = myRdd.repartitionAndSortWithinPartitions(part)
>
> where myRdd is correctly formed as key, value pairs. I am looking to convert
> this to use Dataset/Dataframe instead of RDDs, so my question is:
>
> Is there custom partitioning of Dataset/Dataframe implemented in Spark?
> Can I accomplish the partial sort using mapPartitions on the resulting
> partitioned Dataset/Dataframe?
>
> Any thoughts?
>
> Regards,
> Keith.
>
> http://keith-chapman.com