[ https://issues.apache.org/jira/browse/HUDI-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prashant Wason closed HUDI-2016.
--------------------------------
Assignee: Prashant Wason
Resolution: Fixed
> Metadata table bootstrap does not work when there are inflight instances
> ------------------------------------------------------------------------
>
> Key: HUDI-2016
> URL: https://issues.apache.org/jira/browse/HUDI-2016
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Prashant Wason
> Assignee: Prashant Wason
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
> There is a race condition in metadata table bootstrap when there are inflight
> instances.
> Example: Assume a CLEAN is in progress which is planning to delete
> p1/f1.parquet (as per clean plan). If bootstrap is going on at the same time,
> there are two cases possible:
> 1. Bootstrap lists files in partition p1 BEFORE clean deletes them
>    1.1. Hence p1/f1.parquet is added to the metadata table during bootstrap
>    1.2. When processing the CLEAN, p1/f1.parquet will be deleted from the
>         metadata table
> 2. Bootstrap lists files in partition p1 AFTER clean deletes them
>    2.1. p1/f1.parquet is not found
>    2.2. When processing the CLEAN, p1/f1.parquet will be deleted from the
>         metadata table
> We cannot differentiate case 2.2 from the case where we missed adding
> p1/f1.parquet to the metadata table.
> The metadata reader code throws an exception to ensure that any file being
> deleted was previously added to the metadata table. This exception is thrown
> in case 2.2 above.
>
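The two orderings described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Hudi code: `ToyMetadataTable`, `MissingFileError`, and `run` are hypothetical names, the metadata table is reduced to a set of file paths, and the file `p1/f2.parquet` is invented so that the partition is not empty after the clean.

```python
class MissingFileError(Exception):
    """Raised when a delete targets a file the toy table never recorded."""


class ToyMetadataTable:
    """Hypothetical stand-in for the metadata table: just a set of paths."""

    def __init__(self):
        self.files = set()

    def bootstrap(self, listed_files):
        # Record whatever the file listing returned at bootstrap time.
        self.files.update(listed_files)

    def apply_clean(self, deleted_files):
        # Mirrors the reader-side check from the description: a file being
        # deleted must already have been added to the table.
        for path in deleted_files:
            if path not in self.files:
                raise MissingFileError(path)
            self.files.remove(path)


def run(list_before_clean):
    """Simulate one ordering of bootstrap listing vs. the inflight CLEAN."""
    storage = {"p1/f1.parquet", "p1/f2.parquet"}  # f2 is illustrative only
    clean_plan = ["p1/f1.parquet"]
    table = ToyMetadataTable()

    if list_before_clean:
        table.bootstrap(storage)      # case 1: listing sees f1
        storage -= set(clean_plan)    # clean then deletes f1 from storage
    else:
        storage -= set(clean_plan)    # case 2: clean deletes f1 first
        table.bootstrap(storage)      # listing no longer sees f1

    try:
        table.apply_clean(clean_plan)
        return "ok", sorted(table.files)
    except MissingFileError as err:
        return "error", str(err)


if __name__ == "__main__":
    print(run(list_before_clean=True))
    print(run(list_before_clean=False))
```

In this model the second ordering fails on a file that was legitimately deleted before the listing, and the check has no way to tell that apart from a file that bootstrap genuinely missed.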
--
This message was sent by Atlassian Jira
(v8.3.4#803005)