[jira] [Closed] (HUDI-2751) To avoid the duplicates for streaming read MOR table

Danny Chen (Jira) Sun, 30 Apr 2023 20:41:05 -0700


     [ 
https://issues.apache.org/jira/browse/HUDI-2751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Danny Chen closed HUDI-2751.
----------------------------
    Fix Version/s: 0.12.0
                   0.11.0
       Resolution: Fixed

> To avoid the duplicates for streaming read MOR table
> ----------------------------------------------------
>
>                 Key: HUDI-2751
>                 URL: https://issues.apache.org/jira/browse/HUDI-2751
>             Project: Apache Hudi
>          Issue Type: Improvement
>          Components: Common Core
>            Reporter: Danny Chen
>            Assignee: sivabalan narayanan
>            Priority: Critical
>             Fix For: 0.12.0, 0.11.0
>
>
> Imagine there are commits on the timeline:
> {noformat}
> ----- delta-99 ----- commit 100 (includes delta-99 data set) ----- delta-101 ----- delta-102 -----
>   first read ->| second read ->
> --- range 1 ---|---------------------------------- range 2 ----------------------------------|
> {noformat}
> Instants 99, 101, and 102 are successful non-compaction delta commits;
> instant 100 is a successful compaction instant.
> The first incremental read consumes up to instant 99, and the second read
> consumes from instant 100 to instant 102. The second read therefore consumes
> the commit files of instant 100, whose data has already been consumed.
> The duplicate read occurs under this condition: a compaction
> instant is scheduled and then completes within *one* consume range.
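
The scenario above can be reproduced with a small, self-contained sketch. This
is a toy model in Python, not Hudi code: the Instant class, the
incremental_read function, and the skip_compaction flag are hypothetical names
introduced only for illustration, and the record keys are made up. The only
facts assumed from Hudi are that delta commits on a MOR table appear on the
timeline as "deltacommit" and a completed compaction appears as "commit".
Filtering out compaction instants is shown as one possible way to avoid the
duplicates, not necessarily the fix that was merged for this issue.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instant:
    ts: int          # instant time on the timeline
    action: str      # "deltacommit", or "commit" for a completed compaction
    records: tuple   # record keys visible in the files this instant wrote


# Toy timeline matching the description: compaction 100 rewrites delta-99 data.
TIMELINE = [
    Instant(99, "deltacommit", ("a", "b")),
    Instant(100, "commit", ("a", "b")),
    Instant(101, "deltacommit", ("c",)),
    Instant(102, "deltacommit", ("d",)),
]


def incremental_read(timeline, start_exclusive, end_inclusive, skip_compaction=False):
    """Return record keys from instants in (start_exclusive, end_inclusive]."""
    out = []
    for inst in timeline:
        if not (start_exclusive < inst.ts <= end_inclusive):
            continue
        # Hypothetical mitigation: ignore files written by compaction instants.
        if skip_compaction and inst.action == "commit":
            continue
        out.extend(inst.records)
    return out


first = incremental_read(TIMELINE, 0, 99)      # range 1
second = incremental_read(TIMELINE, 99, 102)   # range 2, includes compaction 100
duplicates = sorted(set(first) & set(second))

second_filtered = incremental_read(TIMELINE, 99, 102, skip_compaction=True)

print("first read:     ", first)
print("second read:    ", second)
print("duplicates:     ", duplicates)
print("second filtered:", second_filtered)
```

In this model the second read returns the records of instant 99 again because
it treats the compacted files of instant 100 as new data; with the hypothetical
skip_compaction flag set, only the records of instants 101 and 102 are returned.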



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
