When I do the following, Spark (2.4) doesn't put a `_SUCCESS` file in the partition directory:

```scala
val outputPath = s"s3://mybucket/$table"

df.orderBy(time)
  .coalesce(numFiles)
  .write
  .partitionBy("partitionDate")
  .mode("overwrite")
  .format("parquet")
  .save(outputPath)
```

But when I remove `partitionBy` and add the partition info to the `outputPath` as shown below, I do see the `_SUCCESS` file:

```scala
val outputPath = s"s3://mybucket/$table/date=<some date>"

df.orderBy(time)
  .coalesce(numFiles)
  .write
  .mode("overwrite")
  .format("parquet")
  .save(outputPath)
```

**Questions:**

1. Is this workaround acceptable?
2. Would it cause problems elsewhere if I don't use the `partitionBy` clause?
3. Is there a better way to ensure that a `_SUCCESS` file is created in each partition?
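For context, the workaround amounts to issuing one write per partition value, so the output committer drops its `_SUCCESS` marker inside each `date=<value>` directory rather than only at the table root. A minimal sketch of that pattern in plain Scala file I/O (the sample rows, local `/tmp` path, and file names are hypothetical stand-ins for the real S3 job, not Spark API):

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object PerPartitionWriteSketch extends App {
  // Hypothetical records keyed by partitionDate; in the real job these come from df.
  val rows = Seq(
    ("2020-01-01", "a"),
    ("2020-01-01", "b"),
    ("2020-01-02", "c")
  )

  // Stand-in for s"s3://mybucket/$table".
  val tableRoot = Paths.get("/tmp/mybucket/mytable")

  // Write each partition to its own date=<value> directory, then mark it
  // complete with a _SUCCESS file -- mimicking one save() call per partition.
  rows.groupBy(_._1).foreach { case (date, group) =>
    val partDir = tableRoot.resolve(s"date=$date")
    Files.createDirectories(partDir)
    val data = group.map(_._2).mkString("\n")
    Files.write(partDir.resolve("part-00000"), data.getBytes(StandardCharsets.UTF_8))
    Files.write(partDir.resolve("_SUCCESS"), Array.emptyByteArray)
  }
}
```

The per-partition loop trades a single atomic job for N smaller jobs, which is exactly why I'm asking whether dropping `partitionBy` has downsides.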