[
https://issues.apache.org/jira/browse/HUDI-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ethan Guo closed HUDI-4119.
---------------------------
Resolution: Fixed
> The first read result is incorrect when the Flink upsert-kafka connector is
> used with Hudi
> ----------------------------------------------------------------------------------------
>
> Key: HUDI-4119
> URL: https://issues.apache.org/jira/browse/HUDI-4119
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: yanxiang
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.11.1
>
>
> The first read result is incorrect when the Flink upsert-kafka connector is
> used with Hudi.
>
> ETL path: Flink upsert-kafka connector -> Hudi table (MOR table, queried in
> streaming mode)
>
> Here is the case:
>
> 1. First time: write two records with the same primary key into Kafka and
> insert them into the Hudi table. The query result should be three records:
> +I first record, -U first record, +U second record. But the first time I
> queried the Hudi table, all the data operations were +I: +I first record,
> +I first record, and +I second record, with no update operation.
> The three +I rows affect Hudi's subsequent ETL process: the result of a
> groupBy is inaccurate.
> 2. Second time: exit the first query and restart the query job on the Hudi
> table. The query results are normal: +I first record, -U first record, +U
> second record.
>
> Reason:
> There is a bug in the program. When no data log file is generated, the
> schema does not include the column '_hoodie_operation'. Please refer to the
> following link for details:
> [https://www.jianshu.com/p/29f9ec5e606e]
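
To illustrate why three +I rows make a downstream groupBy inaccurate, here is a minimal Python sketch. It does not use Flink or Hudi. The key "id1", the amounts 10 and 20, and the aggregate() helper are hypothetical and chosen only for illustration. It folds the expected changelog (+I, -U, +U) and the observed one (+I, +I, +I) into a per-key count and sum, treating -U and -D as retractions.

```python
from collections import defaultdict

# Sign applied per changelog row kind: additions count up, retractions count down.
SIGN = {"+I": 1, "+U": 1, "-U": -1, "-D": -1}


def aggregate(changelog):
    """Fold (row_kind, key, amount) rows into per-key count and sum."""
    count = defaultdict(int)
    total = defaultdict(int)
    for kind, key, amount in changelog:
        count[key] += SIGN[kind]
        total[key] += SIGN[kind] * amount
    return dict(count), dict(total)


# Hypothetical data: the same primary key written twice, amount 10 then 20.
expected = [("+I", "id1", 10), ("-U", "id1", 10), ("+U", "id1", 20)]
observed = [("+I", "id1", 10), ("+I", "id1", 10), ("+I", "id1", 20)]

print(aggregate(expected))  # ({'id1': 1}, {'id1': 20})
print(aggregate(observed))  # ({'id1': 3}, {'id1': 40})
```

With the retraction present, the key ends at one row with the latest amount. Without it, every row is counted as a new insert, so both the count and the sum are inflated.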
--
This message was sent by Atlassian Jira
(v8.20.7#820007)