[
https://issues.apache.org/jira/browse/HUDI-5434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-5434:
--------------------------------------
Description:
As of now, archival in the MDT (metadata table) is guarded by the first entry
in the DT's (data table's) active timeline. But the DT could contain a rollback
that dates back a few days or even weeks. So we need to fix this to check for
the first write action in the DT (commit, delta commit, replace commit) and
guard MDT archival based on that.
Impact:
Could result in a huge number of entries in the MDT's active timeline, which
might hamper performance or cause throttling in cloud stores.
was: As of now, archival in the MDT is guarded by the first entry in the DT's
active timeline. But the DT could contain a rollback that dates back a few days
or even weeks. So we need to fix this to check for the first write action in
the DT (commit, delta commit, replace commit) and guard MDT archival based on
that.
> Fix archival in MDT to not rely on rollbacks/clean in DT
> --------------------------------------------------------
>
> Key: HUDI-5434
> URL: https://issues.apache.org/jira/browse/HUDI-5434
> Project: Apache Hudi
> Issue Type: Bug
> Components: metadata
> Reporter: sivabalan narayanan
> Assignee: Ethan Guo
> Priority: Blocker
> Fix For: 0.13.0
>
>
> As of now, archival in the MDT is guarded by the first entry in the DT's
> active timeline. But the DT could contain a rollback that dates back a few
> days or even weeks. So we need to fix this to check for the first write
> action in the DT (commit, delta commit, replace commit) and guard MDT
> archival based on that.
>
> Impact:
> Could result in a huge number of entries in the MDT's active timeline, which
> might hamper performance or cause throttling in cloud stores.
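
To make the difference between the two guards concrete, here is a minimal
sketch in Python. It is not Hudi code: the Instant class, the function names,
and the rule of archiving nothing when the DT has no write instant are all
assumptions made for illustration. Timestamps are assumed to be fixed-width
strings, so string order matches time order.

```python
from dataclasses import dataclass
from typing import List, Optional

# Actions that count as real writes to the data table (DT) in this sketch.
# The set itself is a hypothetical constant, not taken from the Hudi codebase.
WRITE_ACTIONS = frozenset({"commit", "deltacommit", "replacecommit"})


@dataclass(frozen=True)
class Instant:
    """Hypothetical stand-in for one entry on an active timeline."""
    timestamp: str  # fixed-width, so lexicographic order equals time order
    action: str


def first_entry_cutoff(dt_active: List[Instant]) -> Optional[str]:
    """Guard as described for the current behaviour: earliest DT instant of any kind."""
    return min((i.timestamp for i in dt_active), default=None)


def first_write_cutoff(dt_active: List[Instant]) -> Optional[str]:
    """Proposed guard: earliest DT instant that is a write action."""
    return min(
        (i.timestamp for i in dt_active if i.action in WRITE_ACTIONS),
        default=None,
    )


def archivable(mdt_active: List[Instant], cutoff: Optional[str]) -> List[Instant]:
    """MDT instants strictly older than the cutoff.

    Assumption: with no cutoff available, nothing is archived.
    """
    if cutoff is None:
        return []
    return [i for i in mdt_active if i.timestamp < cutoff]
```

With a DT timeline holding an old rollback followed by a recent commit, the
first-entry guard pins MDT archival at the rollback, while the first-write
guard lets every MDT instant older than the commit be archived.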