kazdy opened a new issue, #7266:
URL: https://github.com/apache/hudi/issues/7266

   **Describe the problem you faced**
   
   A Spark SQL `insert into` overwrites the whole record if a record with the
   same primary key already exists in the Hudi table.
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   ```
   create table hudi_cow_nonpcf_tbl (
     uuid int,
     name string,
     price double
   ) using hudi;
   
   -- first insert
   insert into hudi_cow_nonpcf_tbl select 1, 'a1', 20;
   
   select * from hudi_cow_nonpcf_tbl;
   
   -- returns
   -- 1    a1      20.0
   
   -- another insert with the same key, different values:
   insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;
   
   select * from hudi_cow_nonpcf_tbl;
   -- returns
   -- 1    a2      30.0
   ```
   The behaviour is the same when a precombine field is specified.
   
   **Expected behavior**
   
   Shouldn't the second insert fail if a record with the same key already exists?
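   
   For context, a minimal sketch of how this might be obtained, assuming the `hoodie.sql.insert.mode` session config from the Hudi configuration docs applies to this setup. As far as I understand, it defaults to `upsert`, which would match the result above, while `strict` is documented as enforcing primary key uniqueness. This is an assumption and has not been verified in this environment:
   
   ```sql
   -- assumption: switch the insert mode for the current Spark SQL session
   set hoodie.sql.insert.mode = strict;
   
   -- with strict mode, this insert is expected to be rejected
   -- because a record with uuid = 1 already exists
   insert into hudi_cow_nonpcf_tbl select 1, 'a2', 30;
   ```
   
   If that is the intended mechanism, the open question is whether `upsert` should really be the default for a plain `insert into`.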
   
   **Environment Description**
   
   * Hudi version : 0.12.1
   
   * Spark version : 3.2.1
   
   * Hive version : none
   
   * Hadoop version : unknown
   
   * Storage (HDFS/S3/GCS..) : local filesystem
   
   * Running on Docker? (yes/no) : no
   
   
   
   
   

