[
https://issues.apache.org/jira/browse/HUDI-2917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-2917:
--------------------------------------
Sprint: Cont' improve - 2021/01/10, Cont' improve - 2021/01/18, Cont'
improve - 2021/01/24 (was: Cont' improve - 2021/01/10, Cont' improve -
2021/01/18)
> Rollback may be incorrect for canIndexLogFile index
> ---------------------------------------------------
>
> Key: HUDI-2917
> URL: https://issues.apache.org/jira/browse/HUDI-2917
> Project: Apache Hudi
> Issue Type: Bug
> Components: Common Core
> Reporter: ZiyueGuan
> Assignee: ZiyueGuan
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.11.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Problem:
> After a rollback, the Hudi table may still contain data that should have
> been rolled back.
> Root cause:
> Let's first recall how the rollback plan is generated for the log blocks of
> a deltaCommit. Hudi takes two cases into consideration.
> # Log files with no base file are assumed to consist entirely of inserted
> records, so they are deleted directly. The assumption here is that this
> covers every inserted record.
> # For file IDs that were updated according to the inflight commit metadata
> of the instant being rolled back, a command block is appended to their log
> files to roll them back. This covers every updated record.
> However, the first assumption does not always hold. An index that can index
> log files may route inserted records to an existing log file. In the current
> process, the inflight hoodieCommitMeta is generated before the records are
> assigned to specific file groups, so it may not list the existing file
> groups that receive those inserts.
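
To make the root cause concrete, here is a minimal toy model of the two rollback cases described above. It is only a sketch and not Hudi's actual code: FileGroup, plan_rollback, apply_rollback and the file IDs are hypothetical names invented for illustration. It assumes the inflight metadata lists only file groups that receive updates.

```python
from dataclasses import dataclass, field


@dataclass
class FileGroup:
    # Hypothetical stand-in for a Hudi file group.
    file_id: str
    has_base_file: bool
    # Each log entry is (instant_time, record_key).
    log_entries: list = field(default_factory=list)


def plan_rollback(file_groups, inflight_updated_file_ids):
    """Toy version of the two cases in the description."""
    plan = {}
    for fg in file_groups:
        if not fg.has_base_file:
            # Case 1: log-only file group, assumed to hold only inserts.
            plan[fg.file_id] = "delete_log_files"
        elif fg.file_id in inflight_updated_file_ids:
            # Case 2: file group listed as updated in the inflight metadata.
            plan[fg.file_id] = "append_rollback_block"
    return plan


def apply_rollback(file_groups, plan, instant):
    """Apply the plan and return record keys of `instant` that survive."""
    surviving = []
    for fg in file_groups:
        action = plan.get(fg.file_id)
        if action == "delete_log_files":
            fg.log_entries = []
        elif action == "append_rollback_block":
            fg.log_entries = [e for e in fg.log_entries if e[0] != instant]
        surviving.extend(key for t, key in fg.log_entries if t == instant)
    return surviving


# Instant "002" is being rolled back.
fg1 = FileGroup("fg1", True, [("001", "k1"), ("002", "k9")])  # insert routed to an existing log file
fg2 = FileGroup("fg2", True, [("002", "k2")])                 # ordinary update
fg3 = FileGroup("fg3", False, [("002", "k3")])                # new log-only file group

groups = [fg1, fg2, fg3]
# Assumption: metadata written before assignment knows only about fg2.
plan = plan_rollback(groups, inflight_updated_file_ids={"fg2"})
leftover = apply_rollback(groups, plan, "002")
```

In this model fg1 matches neither case, so the record inserted into its existing log file is left in place after the rollback.
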
>
> Fix:
> To fix this, hoodieCommitMeta should be generated from the result of the
> partitioner rather than from the workProfile. The rollback code may also
> need more comments pointing out this case.
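
The idea behind the proposed fix can be sketched roughly as follows. This is an illustration under assumptions, not the actual patch: build_inflight_metadata and the (record_key, file_id, is_insert) tuple layout are hypothetical, standing in for whatever the partitioner really returns.

```python
from collections import defaultdict


def build_inflight_metadata(assignments):
    """Build per-file-group stats from (hypothetical) partitioner output.

    assignments: iterable of (record_key, file_id, is_insert) tuples,
    i.e. the state *after* records are assigned to file groups.
    """
    meta = defaultdict(lambda: {"inserts": 0, "updates": 0})
    for _key, file_id, is_insert in assignments:
        meta[file_id]["inserts" if is_insert else "updates"] += 1
    return dict(meta)


# "k9" is an insert that the index routed to the existing file group fg1.
assignments = [
    ("k2", "fg2", False),
    ("k9", "fg1", True),
]
metadata = build_inflight_metadata(assignments)
touched_file_ids = sorted(metadata)
```

Because the metadata is derived after assignment, fg1 shows up in it even though it only received an insert, so a rollback driven by this metadata would know to append a command block to fg1's log file.
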
--
This message was sent by Atlassian Jira
(v8.20.1#820001)