kazdy created HUDI-5260:
---------------------------

             Summary: Insert into sql with strict insert mode and no 
preCombineField should not overwrite existing records
                 Key: HUDI-5260
                 URL: https://issues.apache.org/jira/browse/HUDI-5260
             Project: Apache Hudi
          Issue Type: Improvement
            Reporter: kazdy
            Assignee: kazdy
             Fix For: 0.12.2


A Spark SQL insert overwrites the whole record when a record with the same 
primary key already exists, the Hudi table has no preCombineField specified, 
and strict insert mode is used.

To Reproduce

Steps to reproduce the behavior:

create table hudi_cow_nonpcf_tbl (
  uuid int,
  name string,
  price double
) using hudi;

set hoodie.sql.insert.mode=strict;

-- first insert
insert into hudi_cow_nonpcf_tbl select 1, 'a1', 20;

select * from hudi_cow_nonpcf_tbl;

-- returns
-- 1    a1    20.0

-- another insert with the same key, different values:
insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;

select * from hudi_cow_nonpcf_tbl;

-- returns
-- 1    a2    30.0

Expected behavior

The behavior differs when a precombine field is specified: in that case Hudi 
throws an error.
I would expect the second insert to fail if a record with the same key already 
exists when no precombine field is specified and strict insert mode is enabled.
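
For comparison, the precombine case mentioned above could be sketched as 
follows. This is only an illustration: the table name hudi_cow_pcf_tbl, the 
ts column, and its values are assumptions made for this sketch and are not 
part of the original reproduction; the exact error text is not shown because 
it is not quoted in this report.

```sql
-- Hypothetical counterpart table: same columns plus a ts column
-- that is declared as the preCombineField.
create table hudi_cow_pcf_tbl (
  uuid int,
  name string,
  price double,
  ts bigint
) using hudi
tblproperties (
  primaryKey = 'uuid',
  preCombineField = 'ts'
);

set hoodie.sql.insert.mode=strict;

-- first insert
insert into hudi_cow_pcf_tbl select 1, 'a1', 20, 1000;

-- second insert with the same key: per the description above, this is
-- where Hudi throws an error instead of overwriting the record
insert into hudi_cow_pcf_tbl select 1, 'a2', 30, 1001;
```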

https://github.com/apache/hudi/issues/7266



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
