Hi folks,

We are trying to run the following:
`
df.coalesce(1000).sortWithinPartitions("col1").write.mode('overwrite').partitionBy("col2").parquet(...)
`

I can see that coalesce(1000) is applied for every sub-partition. What I
want to know is whether sortWithinPartitions("col1") takes effect before
or after partitionBy. In other words, does Spark first partition by col2
and then sort by col1, or does it sort first and then partition?

Thanks
Nikhil
