This is not an issue with Spark but with the underlying database. The primary 
key constraint has a purpose, and ignoring it would defeat that purpose. 
To handle your use case you would need to make several decisions, and those 
decisions may show that you don't actually want a simple insert-if-not-exists. 
For example: do you want an upsert instead? And how do you want to handle 
deleted data?
You could use a MERGE statement in Oracle to achieve what you have in mind. In 
Spark, you would need to fetch the existing data from the Oracle database and 
then merge it with the new data in Spark, depending on your requirements.
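
As an illustration of the Spark-side approach, here is a minimal sketch in 
Scala using a left anti-join. The Oracle table name (target_table), its 
primary key column (id), the Cassandra keyspace/table, and the connection 
details are all hypothetical placeholders. Note it is not atomic: rows 
inserted by another writer between the read and the write could still 
violate the constraint.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("insert-if-not-exists")
  .getOrCreate()

// New data to migrate, read from Cassandra via the spark-cassandra-connector
// (hypothetical keyspace and table names).
val newData = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "my_ks", "table" -> "my_table"))
  .load()

// Primary keys already present in Oracle; column pruning is pushed down,
// so only the key column is actually fetched over JDBC.
val existingKeys = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL") // hypothetical connection
  .option("dbtable", "target_table")                     // hypothetical table
  .option("user", "scott")
  .option("password", "tiger")
  .load()
  .select("id")                                          // hypothetical key column

// Keep only rows whose key does not already exist in Oracle, then append them.
newData.join(existingKeys, Seq("id"), "left_anti")
  .write
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")
  .option("dbtable", "target_table")
  .option("user", "scott")
  .option("password", "tiger")
  .mode("append")
  .save()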

> On 20.07.2019 at 06:34, Richard <fifistorm...@gmail.com> wrote:
> 
> Any reason why Spark's SaveMode doesn't have a mode that ignores any primary 
> key/unique constraint violations?
> 
> Let's say I'm using Spark to migrate some data from Cassandra to Oracle. I 
> want the insert operation to "ignore rows whose primary keys already exist" 
> instead of failing the whole batch.
> 
> Thanks, 
> Richard 
> 
