dataset sort

2018-02-08 Thread david westwood
Hi: I would like to sort historical data using the DataSet API.

env.setParallelism(10)
val dataset = [(Long, String)] ..
  .partitionByRange(_._1)
  .sortPartition(_._1, Order.ASCENDING)
  .writeAsCsv("mydata.csv").setParallelism(1)

The written data is out of order (sorted only locally), but .print() prints the data

Re: dataset sort

2018-02-09 Thread david westwood
partitions to lose its sort order
>> because the individual partitions are read in a non deterministic order.
>>
>> Cheers,
>> Till
>>
>> On Thu, Feb 8, 2018 at 8:07 PM, david westwood <david.d.westw...@gmail.com> wrote:
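
The effect described in this thread can be reproduced without a cluster. The following is a minimal plain-Scala sketch, not Flink code: `PartitionOrderSketch`, `rangePartitionAndSort` and `isSorted` are hypothetical names invented for illustration, and reversing the partition list is only a deterministic stand-in for the arbitrary order in which a parallelism-1 sink may read its input partitions. It shows that range-partitioned, locally sorted partitions are globally sorted only if they are concatenated in partition order.

```scala
import scala.util.Random

// Hypothetical illustration only; none of these names come from Flink.
object PartitionOrderSketch {
  type Record = (Long, String)

  // Split records into `numPartitions` contiguous key ranges and sort each
  // range locally, mimicking a range partition followed by a per-partition sort.
  def rangePartitionAndSort(data: Seq[Record], numPartitions: Int): Vector[Vector[Record]] = {
    require(numPartitions > 0, "numPartitions must be positive")
    if (data.isEmpty) Vector.fill(numPartitions)(Vector.empty[Record])
    else {
      val min = data.map(_._1).min
      val max = data.map(_._1).max
      // Width of each key range; chosen so the largest key maps to the last partition.
      val width = math.max(1L, (max - min) / numPartitions + 1)
      val grouped = data.groupBy(r => ((r._1 - min) / width).toInt)
      Vector.tabulate(numPartitions) { i =>
        grouped.getOrElse(i, Seq.empty).sortBy(_._1).toVector
      }
    }
  }

  // True if keys are in non-decreasing order.
  def isSorted(records: Seq[Record]): Boolean =
    records.zip(records.drop(1)).forall { case (a, b) => a._1 <= b._1 }

  def main(args: Array[String]): Unit = {
    val rng = new Random(42)
    val data = rng.shuffle((1L to 100L).map(k => (k, s"event-$k")))
    val partitions = rangePartitionAndSort(data, 10)

    // Every partition is sorted on its own.
    println(partitions.forall(isSorted))          // true

    // Reading partitions in range order keeps the global order.
    println(isSorted(partitions.flatten))         // true

    // Reading them in any other order (here: reversed) loses it.
    println(isSorted(partitions.reverse.flatten)) // false
  }
}
```

In the pipeline from the original question, the single writer plays the role of the concatenation step. A workaround often suggested for this situation, stated here as an assumption rather than as the advice given in this thread, is to run the sort itself with parallelism 1 so that only one partition exists, or to keep the sink parallel and treat the output files as ordered by partition.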