Hi, I would like to sort historical data using the DataSet API.

    env.setParallelism(10)
    val dataset = [(Long, String)] ..
      .partitionByRange(_._1)
      .sortPartition(_._1, Order.ASCENDING)
      .writeAsCsv("mydata.csv").setParallelism(1)

The data written to the file is out of order (it is only in local order), but .print() prints the data in the correct order. I have run a small toy sample multiple times and see the same result each time.

Is there a way to sort the entire dataset with parallelism > 1 and write it to a single file in ascending order?