[ 
https://issues.apache.org/jira/browse/HUDI-5686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Purushotham Pushpavanthar updated HUDI-5686:
--------------------------------------------
    Attachment: Screenshot 2023-01-30 at 3.26.57 PM.png

> Missing records when HoodieDeltaStreamer run in continuous
> ----------------------------------------------------------
>
>                 Key: HUDI-5686
>                 URL: https://issues.apache.org/jira/browse/HUDI-5686
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: deltastreamer
>            Reporter: Sagar Sumit
>            Assignee: Purushotham Pushpavanthar
>            Priority: Critical
>             Fix For: 0.13.1
>
>         Attachments: Screenshot 2023-01-30 at 3.26.57 PM.png
>
>
> See issue [https://github.com/apache/hudi/issues/7757] for more details.
> Description of the issue:
> If HoodieDeltaStreamer is forcefully terminated before a commit instant 
> reaches the `COMPLETED` state, the instant is left in either the `REQUESTED` 
> or the `INFLIGHT` state. When HoodieDeltaStreamer is rerun, the first 
> successful commit writes the first batch of records into the Hudi table. 
> However, after the next commit, the changes written by that first commit 
> disappear. This causes *loss of the entire batch* of data written by the 
> first commit after the restart.
> I observed this problem when HoodieDeltaStreamer runs in continuous mode 
> and the job is resubmitted after the AM container is killed, for example 
> because nodes are lost or become unhealthy. The issue is not limited to 
> continuous mode; it can happen whenever a Hudi write is terminated before 
> the instant is marked `COMPLETED`.
> How to reproduce the issue:
> # Run HoodieDeltaStreamer and kill the job on YARN before the commit instant 
> reaches the `COMPLETED` state. Note the number of records after the last 
> successful commit (say 100).
> # Upon resubmission of HoodieDeltaStreamer, 2 new instants are created 
> (1 completed commit and 1 completed rollback). Note the number of delta 
> changes consumed in this run (say 10 new record keys) and the total number 
> of records in the Hudi table (110 unique records).
> # On the next run, wait until Hudi completes the commit, assuming it 
> received 5 records, and check the count of unique records in the Hudi table 
> (it was observed to be 105). The delta records consumed in step 2 are 
> entirely lost.
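
The record counts in the reproduction steps above can be checked with a few lines of arithmetic. This is only an illustrative sketch, not part of Hudi: the function name `expected_count` is hypothetical, and it assumes every batch contains only new, unique record keys (the steps do not say whether the 5 records in step 3 are new keys; the observed count of 105 suggests they are).

```python
def expected_count(baseline, new_keys_per_commit):
    """Unique records expected if no committed batch is lost.

    Assumes every batch adds only new, unique record keys (an assumption).
    """
    return baseline + sum(new_keys_per_commit)


baseline = 100       # step 1: records after the last successful commit
batches = [10, 5]    # step 2 and step 3: new record keys per commit
observed = 105       # step 3: unique records actually found in the table

expected = expected_count(baseline, batches)
missing = expected - observed

print(f"expected={expected} observed={observed} missing={missing}")
```

Under these assumptions the shortfall equals the size of the step 2 batch, which matches the report that the first commit after the restart is lost as a whole.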
> Reason:
> To explain what happens internally, let's take the example of the commit 
> timeline shown below



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
