peanut-chenzhong opened a new issue, #13227:
URL: https://github.com/apache/hudi/issues/13227

   **Describe the problem you faced**
   
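   After a record is deleted from a MOR table, a subsequent MERGE INTO whose source contains the same key applies the `when matched` update clause to that record instead of the `when not matched` insert clause.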
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   Run the following SQL commands:
   
   create table tb_parquet (id int, comb int, col0 int, col1 bigint, col2 float, col3 double, col4 decimal(10,4), col5 string, col6 date, col7 timestamp, col8 boolean, col9 binary, par date) using parquet;
   insert into tb_parquet values
   (1,1,11,100001,101.01,1001.0001,100001.0001,'a000001','2021-12-25','2021-12-25 12:01:01',true,'a01','2021-12-25'),
   (2,2,12,100002,102.02,1002.0002,100002.0002,'a000002','2021-12-25','2021-12-25 12:02:02',true,'a02','2021-12-25'),
   (3,3,13,100003,103.03,1003.0003,100003.0003,'a000003','2021-12-25','2021-12-25 12:03:03',false,'a03','2021-12-25'),
   (4,4,14,100004,104.04,1004.0004,100004.0004,'a000004','2021-12-26','2021-12-26 12:04:04',true,'a04','2021-12-26'),
   (5,5,15,100005,105.05,1005.0005,100005.0005,'a000005','2021-12-26','2021-12-26 12:05:05',false,'a05','2021-12-26');
   
   create table tb_parquet2 (id int, comb int, col0 int, col1 bigint, col2 float, col3 double, col4 decimal(10,4), col5 string, col6 date, col7 timestamp, col8 boolean, col9 binary, par date) using parquet;
   insert into tb_parquet2 values
   (3,30,130,133333,133.33,1333.3333,133333.3333,'aaaaaa3','2021-12-25','2021-12-25 12:33:33',true,'a33','2021-12-25'),
   (5,50,150,100555,105.55,1055.0055,100555.0055,'a000555','2021-12-26','2021-12-26 12:55:05',false,'a55','2021-12-26'),
   (6,6,16,100006,106.06,1006.0006,100006.0006,'a000006','2021-12-27','2021-12-27 12:07:07',false,'a07','2021-12-27');
   
   
   
   drop table if exists hudi_mor_s;
   drop table if exists hudi_mor_s_rt;
   drop table if exists hudi_mor_s_ro;
   create table hudi_mor_s (id int, comb int, col0 int, col1 bigint, col2 float, col3 double, col4 decimal(10,4), col5 string, col6 date, col7 timestamp, col8 boolean, col9 binary, par date) using hudi partitioned by(par) options(type='mor', primaryKey='id', preCombineField='comb');
   insert into hudi_mor_s select * from tb_parquet;
   delete from hudi_mor_s where id = 3;
   merge into hudi_mor_s t1 using tb_parquet2 t2 on t1.id = t2.id
   when matched then update set id=t2.id, comb=t2.comb, col0=t2.col0+1, col1=t2.col1, col2=t2.col2, col3=t2.col3, col4=t2.col4, col5='aaaa', col6=t2.col6, col7=t2.col7, col8=t2.col8, col9=t2.col9, par=t2.par
   when not matched then insert *;
   
   select * from hudi_mor_s order by id;
   
   **Expected behavior**
   
   Since the record with id=3 was deleted before the MERGE INTO, it should be inserted exactly as it appears in tb_parquet2. Instead, it goes through the update branch, so its col0 value no longer matches tb_parquet2 (the update clause sets col0=t2.col0+1).
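   
   A minimal sketch for narrowing the check to the affected key, assuming the tables from the reproduction steps exist in the current session. The values in the comments are derived from the statements in this report (the tb_parquet2 row and the SET clause of the merge), not from a separate run.
   
   ```sql
   -- Expected for id = 3 after the merge (inserted unchanged from tb_parquet2):
   --   comb = 30, col0 = 130, col5 = 'aaaaaa3'
   -- What the update branch would produce instead, per its SET clause:
   --   col0 = t2.col0 + 1 = 131, col5 = 'aaaa'
   select id, comb, col0, col5 from hudi_mor_s where id = 3;
   
   -- The same columns from the merge source, for side-by-side comparison.
   select id, comb, col0, col5 from tb_parquet2 where id = 3;
   ```
   
   If col0 and col5 differ between the two results, the deleted key was treated as matched.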
   
   **Environment Description**
   
   * Hudi version : tested from 0.11.0 to the master branch
   
   * Spark version : 3.3.1
   
   * Hive version :3.3.1
   
   * Hadoop version :3.3.1
   
   * Storage (HDFS/S3/GCS..) :HDFS
   
   * Running on Docker? (yes/no) :no
   
   
   

