[ https://issues.apache.org/jira/browse/HUDI-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aditya Goenka closed HUDI-6410.
-------------------------------
Resolution: Fixed
This was not a bug; the behavior was caused by a typo.
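For anyone reproducing the example quoted below, here is a hedged sketch of the table definition. This message does not say which typo was involved, so the following is an assumption: the option name `precombineKey` in the report is not the property name shown in the Hudi Spark SQL documentation, which is `preCombineField`. Everything else is copied from the report unchanged.

```sql
-- Sketch only, under the assumption that the typo was the option name.
-- preCombineField is the table property name used in the Hudi Spark SQL docs;
-- the table name, columns, and location are taken as-is from the report.
create table spark_mor_no_pre_t5 (
  id int,
  name string,
  updated_at timestamp
) using hudi
options (
  type = 'mor',
  primaryKey = 'id',
  preCombineField = 'updated_at'
) location 'file:///tmp/output/spark_mor_no_pre_t4';
```

With the ordering field declared under a recognized name, Hudi has a column it can use to choose among records that share the same `id`. Whether this alone removes the duplicates seen in the report is not confirmed here.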
> MERGE INTO giving duplicate rows even if table have precombineKey
> -----------------------------------------------------------------
>
> Key: HUDI-6410
> URL: https://issues.apache.org/jira/browse/HUDI-6410
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Aditya Goenka
> Priority: Blocker
> Fix For: 0.14.0
>
> Attachments: image-2023-06-19-15-16-58-055.png,
> image-2023-06-19-15-37-27-202.png
>
>
> MERGE INTO produces duplicate rows even though the table defines a precombine key.
>
> Example -
> spark-sql> create table spark_mor_no_pre_t5 (
> > id int,
> > name string,
> > updated_at timestamp
> > ) using hudi
> > options (
> > type = 'mor',
> > primaryKey = 'id',
> > precombineKey = 'updated_at'
> > ) location 'file:///tmp/output/spark_mor_no_pre_t4';
> Time taken: 0.363 seconds
> spark-sql>
> > merge into spark_mor_no_pre_t5 as target
> > using (
> > select 1 as id, 'c' as name, current_timestamp as updated_at
> > union select 1 as id,'d' as name, current_timestamp as updated_at
> > union select 1 as id,'e' as name, current_timestamp as updated_at
> > ) source
> > on target.id = source.id
> > when matched then update set *
> > when not matched then insert *;
> Time taken: 3.111 seconds
> spark-sql> select * from spark_mor_no_pre_t5;
> 20230619151501003 20230619151501003_0_0 1 4405350d-edd6-465b-ac43-8a68d26f957e-0_0-245-274_20230619151501003.parquet 1 e 2023-06-19 15:15:01.032766
> 20230619151501003 20230619151501003_0_1 1 4405350d-edd6-465b-ac43-8a68d26f957e-0_0-245-274_20230619151501003.parquet 1 e 2023-06-19 15:15:01.032766
> 20230619151501003 20230619151501003_0_2 1 4405350d-edd6-465b-ac43-8a68d26f957e-0_0-245-274_20230619151501003.parquet 1 e 2023-06-19 15:15:01.032766
> Time taken: 0.288 seconds, Fetched 3 row(s)
>
> Github Issue - [https://github.com/apache/hudi/issues/8916]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)