[ https://issues.apache.org/jira/browse/HUDI-5260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
kazdy updated HUDI-5260:
------------------------
Status: In Progress (was: Open)
> Insert into sql with strict insert mode and no preCombineField should not
> overwrite existing records
> ----------------------------------------------------------------------------------------------------
>
> Key: HUDI-5260
> URL: https://issues.apache.org/jira/browse/HUDI-5260
> Project: Apache Hudi
> Issue Type: Improvement
> Reporter: kazdy
> Assignee: kazdy
> Priority: Minor
> Labels: pull-request-available
> Fix For: 0.12.2
>
>
> A Spark SQL insert overwrites the whole record if a record with the same PK
> already exists in a Hudi table that has no preCombineField specified and strict
> insert mode is used.
> To Reproduce
> Steps to reproduce the behavior:
> create table hudi_cow_nonpcf_tbl (
> uuid int,
> name string,
> price double
> ) using hudi;
> set hoodie.sql.insert.mode=strict;
> # first insert
> insert into hudi_cow_nonpcf_tbl select 1, 'a1', 20;
> select * from hudi_cow_nonpcf_tbl;
> # returns
> 1 a1 20.0
> # another insert with the same key, different values:
> insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;
> select * from hudi_cow_nonpcf_tbl;
> # returns
> 1 a2 30.0
> Expected behavior
> The behavior differs when a precombine field is specified: in that case Hudi
> throws an error.
> I would expect the second insert to fail if a record with the same key already
> exists when no precombine field is specified and strict insert mode is
> enabled.
> https://github.com/apache/hudi/issues/7266
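
The difference between the observed and the expected behavior can be illustrated with a small in-memory sketch. This is only a toy model and not Hudi code: `ToyTable`, `DuplicateKeyError` and `second_insert_outcome` are hypothetical names, and the "upsert" mode here simply stands for "a later write replaces the record", which is what the reproduction above shows happening.

```python
class DuplicateKeyError(Exception):
    """Raised when strict mode sees a key that already exists (hypothetical name)."""


class ToyTable:
    """In-memory stand-in for a keyed table; an assumption for illustration, not Hudi code."""

    def __init__(self, insert_mode="upsert"):
        self.insert_mode = insert_mode
        self.rows = {}  # primary key -> record

    def insert(self, key, record):
        if self.insert_mode == "strict" and key in self.rows:
            # Expected behavior from the report: reject the duplicate key.
            raise DuplicateKeyError(f"record with key {key!r} already exists")
        # Upsert-like behavior: a later write replaces the whole record.
        self.rows[key] = record


def second_insert_outcome(insert_mode):
    """Replay the two inserts from the report and describe what happened."""
    table = ToyTable(insert_mode)
    table.insert(1, ("a1", 20.0))
    try:
        table.insert(1, ("a2", 30.0))
    except DuplicateKeyError:
        return "rejected", table.rows[1]
    return "overwritten", table.rows[1]
```

In this sketch, `second_insert_outcome("strict")` leaves the first record untouched, while `second_insert_outcome("upsert")` replaces it, which mirrors the overwrite described in the report.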