Hi:
I would like to globally sort historical data using the DataSet API.
env.setParallelism(10)
val dataset = [(Long, String)] ..
.partitionByRange(_._1)
.sortPartition(_._1, Order.ASCENDING)
.writeAsCsv("mydata.csv").setParallelism(1)
The data in the output file is out of order globally (it is sorted only locally, within each partition),
but
.print()
prints the data
partitions to lose its sort order
>> because the individual partitions are read in a non deterministic order.
>>
>> Cheers,
>> Till
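
Till's point about partitions being read in a non-deterministic order can be illustrated without Flink at all. Below is a minimal plain-Scala sketch, not Flink code: `PartitionOrderDemo`, `rangePartitionAndSort`, `singleWriter` and `isSorted` are hypothetical names made up for this illustration, and the arrival order is passed in explicitly to stand in for whatever order a single writer happens to receive the partitions in.

```scala
// Plain-Scala simulation (no Flink dependency). All names are hypothetical.
object PartitionOrderDemo {
  type Record = (Long, String)

  // Stand-in for partitionByRange + sortPartition: sort by key and cut the
  // result into `n` contiguous key ranges, so each partition is sorted
  // locally and partition i holds smaller keys than partition i + 1.
  def rangePartitionAndSort(data: Seq[Record], n: Int): Vector[Vector[Record]] = {
    val sorted = data.sortBy(_._1).toVector
    val size = math.max(1, math.ceil(sorted.size.toDouble / n).toInt)
    sorted.grouped(size).toVector
  }

  // Stand-in for a sink with parallelism 1: it concatenates the partitions
  // in whatever order they arrive, without merging them by key.
  def singleWriter(partitions: Seq[Seq[Record]], arrivalOrder: Seq[Int]): Seq[Record] =
    arrivalOrder.flatMap(i => partitions(i))

  // True if the keys never decrease from one record to the next.
  def isSorted(xs: Seq[Record]): Boolean =
    xs.zip(xs.drop(1)).forall { case (a, b) => a._1 <= b._1 }

  def main(args: Array[String]): Unit = {
    val data = Seq((5L, "e"), (1L, "a"), (4L, "d"), (2L, "b"), (3L, "c"), (6L, "f"))
    val parts = rangePartitionAndSort(data, 3)
    // Partitions arriving in range order keep the global order ...
    println(isSorted(singleWriter(parts, Seq(0, 1, 2))))
    // ... but any other arrival order only preserves the local order.
    println(isSorted(singleWriter(parts, Seq(2, 0, 1))))
  }
}
```

If a single globally sorted file is the goal, one option that is probably worth trying is to run the sort itself with parallelism 1, for example `.sortPartition(_._1, Order.ASCENDING).setParallelism(1)`, so that there is only one partition to write. This assumes the data fits comfortably on a single task.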
>>
>>
>> On Thu, Feb 8, 2018 at 8:07 PM, david westwood <
>> david.d.westw...@gmail.com> wrote:
>>