[
https://issues.apache.org/jira/browse/HUDI-6114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HUDI-6114:
---------------------------------
Labels: pull-request-available (was: )
> Rollback handling in AbstractHoodieLogRecordReader may not work correctly
> when multi-writer is enabled
> ------------------------------------------------------------------------------------------------------
>
> Key: HUDI-6114
> URL: https://issues.apache.org/jira/browse/HUDI-6114
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Major
> Labels: pull-request-available
>
> When a ROLLBACK command block is encountered, only the last log block is
> potentially rolled back. This may not work with multi-writers, where the
> rollback may be applicable to an older block.
> E.g. assume two processes, P1 and P2, are writing data to the MOR table.
> P1 started at time t1 and P2 started at t2. Let's assume P1 writes its log
> block and then P2 writes its log block.
>
> So the log file has two blocks now [LBlock1(instantTime=t1),
> LBlock2(instantTime=t2)]
> If P1 failed after writing to the log file but before the commit could be
> created, the inflight commit at t1 would eventually be rolled back. In that
> case a rollback block will be written. The log file would look like this:
> [LBlock1(instantTime=t1), LBlock2(instantTime=t2), LBlock(Rollback block with
> targetInstantTime=t1)]
>
> The current AbstractHoodieLogRecordReader code will not roll back LBlock1, as
> it only applies the rollback to the last block.
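
The scenario above can be reproduced with a small, self-contained sketch.
This is not Hudi code: the Block class, both scan methods, and the rollback
block's own instant time (t3) are hypothetical names and values chosen for
illustration. It only contrasts "check the last block" with "check every
queued block" against the targetInstantTime of the rollback block.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class RollbackSketch {
    // Hypothetical stand-in for a log block; not a Hudi class.
    static final class Block {
        final String name;
        final String instantTime;        // instant that wrote this block
        final String targetInstantTime;  // non-null only for rollback blocks

        Block(String name, String instantTime, String targetInstantTime) {
            this.name = name;
            this.instantTime = instantTime;
            this.targetInstantTime = targetInstantTime;
        }

        boolean isRollback() {
            return targetInstantTime != null;
        }
    }

    // Behavior described in the issue: a rollback block is only compared
    // against the most recently queued block.
    static List<String> scanLastBlockOnly(List<Block> log) {
        Deque<Block> queued = new ArrayDeque<>();
        for (Block b : log) {
            if (b.isRollback()) {
                Block last = queued.peekLast();
                if (last != null && last.instantTime.equals(b.targetInstantTime)) {
                    queued.pollLast();
                }
            } else {
                queued.addLast(b);
            }
        }
        return names(queued);
    }

    // One possible alternative (an assumption, not the actual fix): drop every
    // queued block that was written by the targeted instant.
    static List<String> scanAllBlocks(List<Block> log) {
        Deque<Block> queued = new ArrayDeque<>();
        for (Block b : log) {
            if (b.isRollback()) {
                queued.removeIf(q -> q.instantTime.equals(b.targetInstantTime));
            } else {
                queued.addLast(b);
            }
        }
        return names(queued);
    }

    static List<String> names(Deque<Block> blocks) {
        List<String> out = new ArrayList<>();
        for (Block b : blocks) {
            out.add(b.name);
        }
        return out;
    }

    // The log file from the description; "t3" for the rollback block is assumed.
    static List<Block> exampleLog() {
        return List.of(
            new Block("LBlock1", "t1", null),
            new Block("LBlock2", "t2", null),
            new Block("Rollback", "t3", "t1"));
    }

    public static void main(String[] args) {
        System.out.println(scanLastBlockOnly(exampleLog())); // [LBlock1, LBlock2]
        System.out.println(scanAllBlocks(exampleLog()));     // [LBlock2]
    }
}
```

With the last-block-only check, LBlock1 survives even though its instant t1
was rolled back, because the last queued block is LBlock2 (t2). Matching on
the target instant across all queued blocks removes LBlock1 and keeps LBlock2.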