Hi all, is there a way to use dataframe.write.partitionBy("col") and control the number of output files without doing a full repartition? Our data is skewed: some partitions hold much more data than others. Calling .repartition is costly because it triggers a full shuffle. What we really want is to control the size of the output files. Is that even possible?
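For context, here is a rough sketch of the kind of thing we are looking for. It assumes Spark 2.2 or later, where the DataFrame writer accepts a maxRecordsPerFile option that caps the rows written to a single file. The helper names, the 128 MB target, and the 200-byte average row size are all made up for illustration and would need to be measured for real data. As far as I understand, this only splits files that would be too large; it does not merge small ones.

```python
def records_per_file(target_file_bytes: int, avg_row_bytes: int) -> int:
    """Rough row cap per file for a desired file size.

    Both inputs are assumptions: avg_row_bytes has to be estimated from
    real output (for example, file size divided by row count).
    """
    if target_file_bytes <= 0 or avg_row_bytes <= 0:
        raise ValueError("sizes must be positive")
    return max(1, target_file_bytes // avg_row_bytes)


def write_partitioned(df, path, partition_col="col",
                      target_file_bytes=128 * 1024 * 1024,
                      avg_row_bytes=200):
    """Write df partitioned by partition_col, capping rows per file.

    df is expected to be a Spark DataFrame. No repartition is done here,
    so each task still writes its own file(s) per partition value; the
    option only splits a file once it reaches the row cap.
    """
    cap = records_per_file(target_file_bytes, avg_row_bytes)
    (df.write
       .option("maxRecordsPerFile", cap)  # writer option, Spark 2.2+
       .partitionBy(partition_col)
       .mode("overwrite")
       .parquet(path))
    return cap
```

With something like this, large partitions would be split into several files of bounded row count, while small partitions would stay as they are.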
Thanks