sivabalan narayanan created HUDI-6479:
-----------------------------------------

             Summary: Update release docs and quick start guide around 
INSERT_INTO default behavior change 
                 Key: HUDI-6479
                 URL: https://issues.apache.org/jira/browse/HUDI-6479
             Project: Apache Hudi
          Issue Type: Improvement
          Components: spark-sql
            Reporter: sivabalan narayanan


With [this|https://github.com/apache/hudi/pull/9123] patch, we are also 
switching the default behavior of INSERT_INTO to use "insert" as the 
underlying operation. Until 0.13.1, the default was "upsert": if you 
ingested the same batch of records in commit1 and again in commit2, Hudi would 
upsert, and a snapshot read would return only the latest value of each record. 
With this patch, the default becomes "insert", as the name (INSERT_INTO) 
signifies, so ingesting the same batch of records in commit1 and in commit2 
will result in duplicate records in a snapshot read. If users explicitly 
override the respective configs, we will honor them; the behavior changes 
only in the default case, where none of those configs are set explicitly.
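
To make the difference concrete for the docs, here is a rough sketch in plain 
Python. It is a toy model only, not Hudi code: ToyTable, its methods, and the 
(record_key, value) record shape are hypothetical names made up for 
illustration, and the real write path is far more involved. It only mimics 
what a snapshot read would return after writing the same batch in commit1 and 
commit2 under each operation.

```python
from typing import Dict, List, Tuple

# Hypothetical, simplified record shape: (record_key, value).
Record = Tuple[str, str]


class ToyTable:
    """Toy stand-in for a table. This is not Hudi's storage model."""

    def __init__(self) -> None:
        self.rows: List[Record] = []

    def insert(self, batch: List[Record]) -> None:
        # "insert": append every record without checking for an existing key.
        self.rows.extend(batch)

    def upsert(self, batch: List[Record]) -> None:
        # "upsert": keep one row per key, taking the value from the newest batch.
        latest: Dict[str, str] = dict(self.rows)
        for key, value in batch:
            latest[key] = value
        self.rows = list(latest.items())

    def snapshot_read(self) -> List[Record]:
        # Return a copy of the rows currently visible in the table.
        return list(self.rows)


batch = [("id1", "a"), ("id2", "b")]

# Old default (until 0.13.1): both commits behave like upserts.
old_default = ToyTable()
old_default.upsert(batch)  # commit1
old_default.upsert(batch)  # commit2

# New default with the patch: both commits behave like inserts.
new_default = ToyTable()
new_default.insert(batch)  # commit1
new_default.insert(batch)  # commit2

print(len(old_default.snapshot_read()))  # 2 rows, one per key
print(len(new_default.snapshot_read()))  # 4 rows, each key appears twice
```

In this sketch the upsert path leaves one row per key while the insert path 
leaves two, which is the duplicate-on-reingest behavior the release docs and 
quick start guide should call out.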

--
This message was sent by Atlassian Jira
(v8.20.10#820010)