Thanks for the pointer, Saliya. I'm looking for an equivalent API in
Dataset/DataFrame for repartitionAndSortWithinPartitions. I've already
converted most of the RDDs to DataFrames.
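
To make the question concrete, here is a minimal sketch of the closest
DataFrame calls I'm aware of: repartition on a column followed by
sortWithinPartitions. The column names "key" and "value", the sample rows,
and the local SparkSession are assumptions for illustration only. Note that
this hash-partitions on the key column, so it is not a drop-in replacement
for a custom Partitioner such as MyOwnPartitioner.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object RepartitionSortSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use its own config.
    val spark = SparkSession.builder()
      .appName("repartition-sort-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for myRdd after conversion to a DataFrame.
    val df = Seq(("b", 2), ("a", 1), ("c", 3), ("a", 0)).toDF("key", "value")

    val outputPartitionCount = 300

    val result = df
      // Hash-partitions rows by "key" into outputPartitionCount partitions.
      // This does NOT apply a custom Partitioner.
      .repartition(outputPartitionCount, col("key"))
      // Sorts rows inside each partition; no global ordering is implied.
      .sortWithinPartitions(col("key"))

    result.show()
    spark.stop()
  }
}
```

What this does not cover is the custom partitioning logic itself, which is
the part I'm still asking about.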

Regards,
Keith.

http://keith-chapman.com

On Sat, Jun 24, 2017 at 3:48 AM, Saliya Ekanayake <esal...@gmail.com> wrote:

> I haven't worked with datasets but would this help
> https://stackoverflow.com/questions/37513667/how-to-create-a-spark-dataset-from-an-rdd?
>
> On Jun 23, 2017 5:43 PM, "Keith Chapman" <keithgchap...@gmail.com> wrote:
>
>> Hi,
>>
>> I have code that does the following using RDDs,
>>
>> val outputPartitionCount = 300
>> val part = new MyOwnPartitioner(outputPartitionCount)
>> val finalRdd = myRdd.repartitionAndSortWithinPartitions(part)
>>
>> where myRdd is correctly formed as key, value pairs. I am looking to
>> convert this to use Dataset/Dataframe instead of RDDs, so my question is:
>>
>> Is there custom partitioning of Dataset/Dataframe implemented in Spark?
>> Can I accomplish the partial sort using mapPartitions on the resulting
>> partitioned Dataset/Dataframe?
>>
>> Any thoughts?
>>
>> Regards,
>> Keith.
>>
>> http://keith-chapman.com
>>
>
