Adding this simple setting helped me overcome the issue:

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
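For context, a minimal sketch of how that setting is typically wired into a partitioned overwrite. The DataFrame contents, column names and output path below are placeholders for illustration, not taken from the original thread:

import org.apache.spark.sql.{SaveMode, SparkSession}

object DynamicOverwriteExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("dynamic-partition-overwrite").getOrCreate()
    import spark.implicits._

    // "dynamic" makes SaveMode.Overwrite replace only the partitions present
    // in the incoming DataFrame; the default ("static") clears the whole
    // output path first, which is the deletion described below.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    // Placeholder data; the column names and the S3 path are assumptions.
    val dataDF = Seq(
      ("2020", "01", "2020-01-15", "event-a"),
      ("2020", "01", "2020-01-16", "event-b")
    ).toDF("year", "month", "date", "payload")

    dataDF.write
      .partitionBy("year", "month", "date")
      .mode(SaveMode.Overwrite)
      .parquet("s3://data/test2/events/")
  }
}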
My issue -
In an S3 folder, I previously had data partitioned by *ingestiontime*.
Now I wanted to reprocess this data and partition it by -
businessname &
Hi Yash,
Yes, AFAIK, that is the expected behavior of the Overwrite mode.
I think you can use the following approaches if you want to perform a job
on each partition:
[1] for each partition in DF :
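A rough sketch of what that per-partition approach could look like in the shell. The partition column "date", basePath and the string type of the column are assumptions for illustration, not from the original mail:

import org.apache.spark.sql.{DataFrame, SaveMode}

// Collect the distinct partition values, then overwrite each partition's
// directory individually so the untouched partitions are left in place.
def overwritePerPartition(df: DataFrame, basePath: String): Unit = {
  val dates = df.select("date").distinct().collect().map(_.getString(0))

  dates.foreach { d =>
    df.filter(df("date") === d)
      .drop("date")                      // the value is encoded in the path
      .write
      .mode(SaveMode.Overwrite)          // only this directory is replaced
      .parquet(s"$basePath/date=$d")
  }
}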
Hi All,
While writing a partitioned data frame as partitioned text files, I see that
Spark deletes all existing partitions while writing a few new partitions:
dataDF.write.partitionBy("year", "month", "date")
  .mode(SaveMode.Overwrite)
  .text("s3://data/test2/events/")
Is this expected behavior?