[ 
https://issues.apache.org/jira/browse/HUDI-6479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ethan Guo updated HUDI-6479:
----------------------------
    Fix Version/s: 0.16.0

> Update release docs and quick start guide around INSERT_INTO default behavior 
> change 
> -------------------------------------------------------------------------------------
>
>                 Key: HUDI-6479
>                 URL: https://issues.apache.org/jira/browse/HUDI-6479
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: spark-sql
>            Reporter: sivabalan narayanan
>            Assignee: Shiyan Xu
>            Priority: Major
>             Fix For: 0.15.0, 0.16.0
>
>
> With [this|https://github.com/apache/hudi/pull/9123] patch, we are also 
> switching the default behavior of INSERT_INTO to use "insert" as the 
> underlying operation. Until 0.13.1, the default behavior was "upsert". In 
> other words, if you ingest the same batch of records in commit1 and in 
> commit2, Hudi will do an upsert, and a snapshot read will return only the 
> latest value. With this patch, we are changing the default behavior to use 
> "insert", as the name (INSERT_INTO) signifies. So, ingesting the same batch 
> of records in commit1 and in commit2 will result in duplicate records with a 
> snapshot read. If users override the respective configs, we will honor them, 
> but the default case, where none of the respective configs is overridden 
> explicitly, will see a behavior change.
>  
>  
>  
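
To illustrate the behavior change described above, here is a minimal toy sketch in plain Python. It is not Hudi code and does not call any Hudi or Spark API: the names `write_upsert`, `write_insert`, and the `(record_key, value)` record shape are hypothetical, chosen only to model how the same batch written in commit1 and commit2 looks to a snapshot read under each operation.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape for this sketch: (record_key, value).
Record = Tuple[str, str]


def write_upsert(table: List[Record], batch: List[Record]) -> List[Record]:
    """Toy upsert: an incoming record replaces any stored record with the same key."""
    merged: Dict[str, str] = dict(table)
    merged.update(batch)
    return list(merged.items())


def write_insert(table: List[Record], batch: List[Record]) -> List[Record]:
    """Toy insert: incoming records are appended without any key lookup."""
    return table + list(batch)


batch: List[Record] = [("id-1", "a"), ("id-2", "b")]

# Write the same batch twice, standing in for commit1 and commit2.
upsert_table: List[Record] = []
insert_table: List[Record] = []
for _commit in ("commit1", "commit2"):
    upsert_table = write_upsert(upsert_table, batch)
    insert_table = write_insert(insert_table, batch)

# A "snapshot read" in this toy model is simply the current table contents.
print(len(upsert_table))  # 2: one row per key (old default, "upsert")
print(len(insert_table))  # 4: every key appears twice (new default, "insert")
```

In this model, users who relied on the old default to deduplicate repeated batches would see twice as many rows after the change unless they explicitly configure the operation.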



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
