[ 
https://issues.apache.org/jira/browse/HUDI-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Wason closed HUDI-2016.
--------------------------------
      Assignee: Prashant Wason
    Resolution: Fixed

> Metadata table bootstrap does not work when there are inflight instances
> ------------------------------------------------------------------------
>
>                 Key: HUDI-2016
>                 URL: https://issues.apache.org/jira/browse/HUDI-2016
>             Project: Apache Hudi
>          Issue Type: Bug
>            Reporter: Prashant Wason
>            Assignee: Prashant Wason
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.9.0
>
>
> There is a race condition in metadata table bootstrap when there are inflight 
> instances.
> Example: Assume a CLEAN is in progress which is planning to delete 
> p1/f1.parquet (as per clean plan). If bootstrap is going on at the same time, 
> there are two cases possible:
>  # bootstrap lists files in partition p1 BEFORE clean deletes them
>  ## hence p1/f1.parquet is added to metadata table during bootstrap
>  ## When processing the CLEAN, p1/f1.parquet will be deleted from metadata 
> table
>  # bootstrap lists files in partition p1 AFTER clean deletes them
>  ## p1/f1.parquet is not found
>  ## When processing the CLEAN, p1/f1.parquet will be deleted from metadata 
> table
> We cannot differentiate case 2.2 from the case where we missed adding 
> p1/f1.parquet to the metadata table.
> The metadata reader code throws an exception unless every file being 
> deleted was previously added to the metadata table. This exception is 
> thrown in case 2.2 above.
>  
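
The two orderings described above can be reproduced with a small toy model. This is only a sketch under stated assumptions: the names MetadataTable, bootstrap, apply_clean and run are hypothetical and are not Hudi's API, and the file system is modelled as a plain dict. It shows why a strict "deleted file must have been added" check fails only when bootstrap lists the partition after the clean has removed the file.

```python
class MetadataTable:
    """Toy stand-in for the metadata table: partition -> set of file names."""

    def __init__(self):
        self.files = {}

    def bootstrap(self, filesystem):
        # Bootstrap records whatever a file listing returns at this moment.
        for partition, names in filesystem.items():
            self.files[partition] = set(names)

    def apply_clean(self, partition, name):
        # Assumed validation, modelled on the description: a file being
        # deleted must already have been added to the metadata table.
        known = self.files.get(partition, set())
        if name not in known:
            raise ValueError(f"{partition}/{name} deleted but never added")
        known.remove(name)


def run(bootstrap_before_clean):
    """Replay one ordering of bootstrap vs. the physical delete by CLEAN."""
    filesystem = {"p1": {"f1.parquet", "f2.parquet"}}
    table = MetadataTable()
    if bootstrap_before_clean:
        table.bootstrap(filesystem)              # case 1: listing sees f1
        filesystem["p1"].discard("f1.parquet")   # clean deletes afterwards
    else:
        filesystem["p1"].discard("f1.parquet")   # clean deletes first
        table.bootstrap(filesystem)              # case 2: f1 is already gone
    # The CLEAN is then processed against the metadata table.
    table.apply_clean("p1", "f1.parquet")
    return table.files["p1"]
```

Calling run(True) leaves only f2.parquet in the table, while run(False) raises ValueError. In this model the table cannot tell that failure apart from a genuinely missed file, which is the ambiguity reported in the issue.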



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
