Hi All,
While writing a partitioned DataFrame out as partitioned text files, I see that
Spark deletes all existing partitions under the output path, even though the
write only produces a few new partitions.

dataDF.write
  .partitionBy("year", "month", "date")
  .mode(SaveMode.Overwrite)
  .text("s3://data/test2/events/")


Is this the expected behavior?

I have a correction job that overwrites a couple of past partitions based on
newly arriving data. I only want those specific partitions to be removed and
rewritten.

Is there a neater way to do that other than the following (rough sketch after
the list):
- Find the affected partitions
- Delete them using the Hadoop FileSystem API
- Write the DF in Append mode
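In case it helps frame the question, this is roughly what I mean by those three
steps. It is only a sketch: it assumes a SparkSession named spark, the same
dataDF and output path as above, and that the partition columns are year, month
and date.

  import org.apache.hadoop.fs.Path
  import org.apache.spark.sql.SaveMode

  val basePath = "s3://data/test2/events/"

  // 1. Find the partition values present in the incoming correction data
  val partitionsToReplace = dataDF
    .select("year", "month", "date")
    .distinct()
    .collect()

  // 2. Delete only those partition directories via the Hadoop FileSystem API
  val fs = new Path(basePath)
    .getFileSystem(spark.sparkContext.hadoopConfiguration)
  partitionsToReplace.foreach { row =>
    val dir = new Path(
      s"${basePath}year=${row.get(0)}/month=${row.get(1)}/date=${row.get(2)}")
    if (fs.exists(dir)) fs.delete(dir, true)
  }

  // 3. Append the corrected data so untouched partitions are left alone
  dataDF.write
    .partitionBy("year", "month", "date")
    .mode(SaveMode.Append)
    .text(basePath)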


Cheers
Yash
