[
https://issues.apache.org/jira/browse/HUDI-6153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17718734#comment-17718734
]
sivabalan narayanan commented on HUDI-6153:
-------------------------------------------
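Consider 1 file group (FG) with 5 versions, written by commits C1 ... C5.
Then 1 clean (C6) runs, which cleans up the first version (C1).
The data table now has 4 file slices: C1 (cleaned up), C2 ... C5, then C6 (clean).
MDT: before C6 is applied, it will have 5 files.
After C6 is applied to the MDT, it has only 4 files (C2 ... C5).
Now restore to C4.
The MDT will roll back to C4, so it again lists the file from C1 even though
the clean already deleted it from the data table.
We don't re-apply or negate C6, since rollback is applicable only for the write
timeline.
So we need to re-apply the clean (C6) to the MDT after the restore.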
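
The scenario above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions, not Hudi code: the MDT is modelled as a plain list of deltacommits that is replayed into a set of file names, and the names `mdt_state` and `fg1_c1` ... `fg1_c5` are made up for the example.

```python
def mdt_state(deltacommits):
    """Replay a list of (kind, payload) deltacommits into the set of listed files."""
    files = set()
    for kind, payload in deltacommits:
        if kind == "commit":
            files.add(payload)          # a write adds one file to the listing
        elif kind == "clean":
            files -= set(payload)       # a clean removes the deleted files
    return files


# C1 ... C5 each write one version of the same file group.
mdt_log = [("commit", f"fg1_c{i}") for i in range(1, 6)]
before_clean = mdt_state(mdt_log)       # 5 files

# C6 is a clean that removes the version written by C1.
clean_c6 = ("clean", ["fg1_c1"])
mdt_log.append(clean_c6)
after_clean = mdt_state(mdt_log)        # 4 files (C2 ... C5)

# Restore to C4: the data table drops the C5 file, but the file removed by
# the clean stays deleted on storage.
storage = after_clean - {"fg1_c5"}

# An actual rollback of the MDT to C4 drops every deltacommit after C4,
# including the one that recorded the clean.
rolled_back_log = mdt_log[:4]
mdt_after_restore = mdt_state(rolled_back_log)
stale = mdt_after_restore - storage     # files listed in MDT but gone from storage

# Re-applying the clean brings the MDT back in sync with storage.
fixed = mdt_state(rolled_back_log + [clean_c6])
```

In this model `stale` contains only the C1 file, which is the mismatch described above, and `fixed` matches `storage` once the clean is re-applied.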
> Change the rollback mechanism for MDT to actual rollbacks rather than
> appending revert blocks
> ---------------------------------------------------------------------------------------------
>
> Key: HUDI-6153
> URL: https://issues.apache.org/jira/browse/HUDI-6153
> Project: Apache Hudi
> Issue Type: Improvement
> Components: metadata
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Major
> Fix For: 0.14.0, 1.0.0
>
>
> When rolling back completed commits for indexes like record-index, the list
> of all keys removed from the dataset is required. This information is not
> available during rollback processing in the MDT, since the files have already
> been deleted during the rollback inflight processing.
> Hence, the current MDT rollback mechanism of adding -files, -col_stats
> entries does not work for record index.
> This PR changes the rollback mechanism to actually roll back deltacommits on
> the MDT. This makes the rollback handling faster and keeps the MDT in sync
> with the dataset.