
Re: Spark SQL overwrite/append for partitioned tables

2016-07-25 Thread Yash Sharma

Correction:

dataDF.write.partitionBy("year", "month", "date").mode(SaveMode.Append).text("s3://data/test2/events/")

On Tue, Jul 26, 2016 at 10:59 AM, Yash Sharma wrote:
> Based on the behavior of spark [1], Overwrite mode will delete all your
> data when you try to overwrite a particular partit

Re: Spark SQL overwrite/append for partitioned tables

2016-07-25 Thread Yash Sharma

Based on the behavior of Spark [1], Overwrite mode will delete all your data when you try to overwrite a particular partition.

What I did:
- Use the S3 API to delete all partitions
- Use a Spark DataFrame to write in Append mode [2]

1. http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-deletes-a
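The delete-then-append steps described above can be sketched roughly as follows. This is only an illustration under assumptions, not the code actually used in this thread: it deletes a single partition directory through the Hadoop FileSystem API rather than the S3 API, and the names ReplacePartition, partitionDir and replacePartition are hypothetical. It also assumes the cluster has a Hadoop filesystem implementation configured for the path's scheme, and that dataDF holds only rows for the partition being replaced.

```scala
import java.net.URI

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.{DataFrame, SaveMode}

// Hypothetical helper object, for illustration only.
object ReplacePartition {

  // Build the Hive-style directory that partitionBy(...) writes to,
  // e.g. <base>/year=2016/month=07/date=25. The values must be spelled
  // exactly as Spark renders them for the partition columns.
  def partitionDir(base: String, parts: Seq[(String, String)]): String =
    (base.stripSuffix("/") +: parts.map { case (k, v) => s"$k=$v" }).mkString("/")

  // Step 1: delete the existing partition directory, if any.
  // Step 2: write the new rows in Append mode so other partitions are untouched.
  // Assumption: dataDF contains only rows belonging to `parts`.
  def replacePartition(dataDF: DataFrame, base: String,
                       parts: Seq[(String, String)]): Unit = {
    val dir = new Path(partitionDir(base, parts))
    val hadoopConf = dataDF.sqlContext.sparkContext.hadoopConfiguration
    val fs = FileSystem.get(new URI(base), hadoopConf)
    if (fs.exists(dir)) fs.delete(dir, true) // true = recursive

    // The text format expects a single string column once the
    // partition columns have been split off into directory names.
    dataDF.write
      .partitionBy(parts.map(_._1): _*)
      .mode(SaveMode.Append)
      .text(base)
  }
}
```

Under these assumptions, a rerun for the same day deletes and rewrites only that day's directory, which gives the idempotent behavior asked for in the original question. There is a window between the delete and the end of the write during which the partition is missing or incomplete.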

Re: Spark SQL overwrite/append for partitioned tables

2016-07-25 Thread Pedro Rodriguez

I probably should have been more specific about the code we are using, which is something like:

df.write.mode("append or overwrite here").partitionBy("date").saveAsTable("my_table")

Unless there is something like what I described on the native API, I will probably take the approach of h

Re: Spark SQL overwrite/append for partitioned tables

2016-07-25 Thread RK Aduri

You can use a temporary file to capture the data that you would like to overwrite, and then swap it with the existing partition whose data you want to wipe. The swap can be done with a simple rename of the partition, followed by a repair of the table to pick up the new partition. I am not sure if that
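A minimal sketch of the rename-based swap described above, shown on a local filesystem with java.nio as a stand-in. The object and method names (PartitionSwap, deleteTree, swap) are hypothetical; on HDFS the equivalent calls would go through the Hadoop FileSystem API, and on an object store such as S3 a rename is typically a copy followed by a delete, so it is not atomic there.

```scala
import java.nio.file.{Files, Path, StandardCopyOption}

import scala.collection.JavaConverters._

// Hypothetical helper object, for illustration only.
object PartitionSwap {

  // Recursively delete a directory tree if it exists.
  // Files.walk lists parents before children, so delete in reverse order.
  def deleteTree(dir: Path): Unit =
    if (Files.exists(dir)) {
      val walk = Files.walk(dir)
      try walk.iterator().asScala.toList.reverse.foreach(p => Files.delete(p))
      finally walk.close()
    }

  // Replace the `live` partition directory with the `staged` one:
  // move the old data aside, rename the staged directory into place,
  // then drop the old copy.
  def swap(staged: Path, live: Path): Unit = {
    val old = live.resolveSibling(live.getFileName.toString + ".old")
    deleteTree(old)
    if (Files.exists(live)) Files.move(live, old, StandardCopyOption.ATOMIC_MOVE)
    Files.move(staged, live, StandardCopyOption.ATOMIC_MOVE)
    deleteTree(old)
  }
}
```

Because the partition directory keeps its name, a table that already knows about the partition should see the new files in place. If the partition was not registered before, a Hive statement such as MSCK REPAIR TABLE my_table is the usual way to pick it up; whether that applies depends on how the table was created.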

Spark SQL overwrite/append for partitioned tables

2016-07-25 Thread Pedro Rodriguez

What would be the best way to accomplish the following behavior:

1. There is a table which is partitioned by date
2. Spark job runs on a particular date, we would like it to wipe out all data for that date. This is to make the job idempotent and lets us rerun a job if it failed without fear of dup
