[
https://issues.apache.org/jira/browse/HUDI-5769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
sivabalan narayanan updated HUDI-5769:
--------------------------------------
Status: In Progress (was: Patch Available)
> Partitions created by Async indexer could be deleted by regular writers
> -----------------------------------------------------------------------
>
> Key: HUDI-5769
> URL: https://issues.apache.org/jira/browse/HUDI-5769
> Project: Apache Hudi
> Issue Type: Sub-task
> Components: metadata
> Reporter: sivabalan narayanan
> Assignee: Sagar Sumit
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 1.0.1
>
> Original Estimate: 8h
> Remaining Estimate: 8h
>
> The regular writer has a flow that checks each metadata table (MDT)
> partition: if the partition is not enabled in the writer's configs but
> exists in storage and is listed among the fully built-out partitions in the
> table config, Hudi deletes it, on the assumption that the user wants to
> disable it.
> This does not work well with the async indexer.
>
> process1: Deltastreamer runs continuously with no metadata configs set.
> The default for metadata enable is true, so the "files" partition is
> initialized inline on the first commit.
> No value is set for col stats enable, so no action is taken for col stats.
>
> process2: the user starts HoodieIndexer for the col stats partition.
> Once the indexer completes, tableConfig adds "col stats" to the fully
> built-out metadata partitions.
>
> Back in process1, when Deltastreamer starts its next write, it detects that
> col stats is not enabled (the default value as per code) while tableConfig
> shows col stats as fully built out, so it deletes the col stats
> partition and updates tableConfig.
>
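The sequence described above can be reproduced with a small toy model. This is only a sketch for reasoning about the race, not Hudi code: `TableState`, `regular_write`, `async_index`, the config keys `metadata_enable` and `col_stats_enable`, and the partition names are all hypothetical stand-ins for the writer flow, HoodieIndexer, and tableConfig mentioned in the report.

```python
from dataclasses import dataclass, field

# Illustrative partition names; not taken from the Hudi code base.
FILES = "files"
COL_STATS = "column_stats"


@dataclass
class TableState:
    """Toy stand-in for the table config plus metadata table storage."""
    fully_built: set = field(default_factory=set)  # partitions listed in table config
    in_storage: set = field(default_factory=set)   # partitions present on storage


def writer_enabled_partitions(writer_configs: dict) -> set:
    """Partitions a writer treats as enabled, using the defaults from the report."""
    enabled = set()
    if writer_configs.get("metadata_enable", True):     # unset -> enabled
        enabled.add(FILES)
    if writer_configs.get("col_stats_enable", False):   # unset -> not enabled
        enabled.add(COL_STATS)
    return enabled


def regular_write(state: TableState, writer_configs: dict) -> set:
    """One commit by the regular writer; returns the partitions it deleted."""
    enabled = writer_enabled_partitions(writer_configs)
    # Enabled partitions are initialized inline ("files" on the first commit).
    state.in_storage |= enabled
    state.fully_built |= enabled
    # Problematic step: a built-out partition that this writer did not enable
    # is read as "the user wants it disabled" and is deleted.
    to_delete = (state.fully_built & state.in_storage) - enabled
    state.in_storage -= to_delete
    state.fully_built -= to_delete
    return to_delete


def async_index(state: TableState, partition: str) -> None:
    """Separate indexer process: builds a partition and records it as built out."""
    state.in_storage.add(partition)
    state.fully_built.add(partition)


def simulate():
    state = TableState()
    deltastreamer_configs = {}                             # process1: nothing set
    first = regular_write(state, deltastreamer_configs)    # first commit
    async_index(state, COL_STATS)                          # process2 completes
    after_index = set(state.fully_built)
    second = regular_write(state, deltastreamer_configs)   # next write in process1
    return first, after_index, second, set(state.fully_built)
```

In this model the second `regular_write` call removes the partition that `async_index` just built, because the writer cannot tell "never configured" apart from "explicitly disabled".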