[ 
https://issues.apache.org/jira/browse/HUDI-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sivabalan narayanan updated HUDI-5431:
--------------------------------------
    Sprint:   (was: 0.13.0 Final Sprint)

> Fix rolling back of partially failed writes for all code paths in MDT write 
> flow
> --------------------------------------------------------------------------------
>
>                 Key: HUDI-5431
>                 URL: https://issues.apache.org/jira/browse/HUDI-5431
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metadata
>            Reporter: sivabalan narayanan
>            Assignee: sivabalan narayanan
>            Priority: Blocker
>             Fix For: 0.13.0
>
>
> In SparkHoodieBackedTableMetadataWriter
> {code:java}
> if (!metadataMetaClient.getActiveTimeline().containsInstant(instantTime)) {
>   // if this is a new commit being applied to metadata for the first time
>   writeClient.startCommitWithTime(instantTime);
> } else {
>   Option<HoodieInstant> alreadyCompletedInstant = metadataMetaClient.getActiveTimeline()
>       .filterCompletedInstants()
>       .filter(entry -> entry.getTimestamp().equals(instantTime))
>       .lastInstant();
>   if (alreadyCompletedInstant.isPresent()) {
>     // this code path refers to a re-attempted commit that got committed to
>     // metadata table, but failed in datatable.
>     // for eg, lets say compaction c1 on 1st attempt succeeded in metadata
>     // table and failed before committing to datatable.
>     // when retried again, data table will first rollback pending compaction.
>     // these will be applied to metadata table, but all changes
>     // are upserts to metadata table and so only a new delta commit will be
>     // created.
>     // once rollback is complete, compaction will be retried again, which
>     // will eventually hit this code block where the respective commit is
>     // already part of completed commit. So, we have to manually remove the
>     // completed instant and proceed.
>     // and it is for the same reason we enabled
>     // withAllowMultiWriteOnSameInstant for metadata table.
>     HoodieActiveTimeline.deleteInstantFile(metadataMetaClient.getFs(),
>         metadataMetaClient.getMetaPath(), alreadyCompletedInstant.get());
>     metadataMetaClient.reloadActiveTimeline();
>   }
>   // If the alreadyCompletedInstant is empty, that means there is a requested
>   // or inflight instant with the same instant time.  This happens for data
>   // table clean action which reuses the same instant time without rollback
>   // first.  It is a no-op here as the clean plan is the same, so we don't
>   // need to delete the requested and inflight instant files in the active
>   // timeline.
> } {code}
>  
> We missed rolling back the partially failed commit in the else block.
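
The three cases the snippet distinguishes can be sketched as a small, self-contained model. This is only an illustration under stated assumptions: `MdtInstantSketch`, `State`, `Action` and `classify` are hypothetical names, not Hudi APIs, and the third branch is one reading of the missing rollback, not the actual fix. The sketch does not separate the clean-action case that the existing comment treats as a no-op.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the metadata table (MDT) timeline.
// None of these types or methods are Hudi APIs.
final class MdtInstantSketch {
    enum State { REQUESTED, INFLIGHT, COMPLETED }

    enum Action {
        START_NEW_COMMIT,         // instant not yet on the MDT timeline
        DELETE_COMPLETED_INSTANT, // instant completed in MDT, failed in the data table
        ROLLBACK_PARTIAL_WRITE    // assumption: pending instant left by a failed write
    }

    // Maps an instant time to its latest known state.
    private final Map<String, State> timeline = new HashMap<>();

    void put(String instantTime, State state) {
        timeline.put(instantTime, state);
    }

    // Decides what to do before writing instantTime to the MDT.
    Action classify(String instantTime) {
        State existing = timeline.get(instantTime);
        if (existing == null) {
            return Action.START_NEW_COMMIT;
        }
        if (existing == State.COMPLETED) {
            return Action.DELETE_COMPLETED_INSTANT;
        }
        // REQUESTED or INFLIGHT: the branch the issue reports as unhandled.
        return Action.ROLLBACK_PARTIAL_WRITE;
    }

    public static void main(String[] args) {
        MdtInstantSketch mdt = new MdtInstantSketch();
        System.out.println(mdt.classify("001")); // nothing recorded yet
        mdt.put("001", State.INFLIGHT);
        System.out.println(mdt.classify("001")); // partially failed write
        mdt.put("001", State.COMPLETED);
        System.out.println(mdt.classify("001")); // re-attempted commit
    }
}
```

Keeping the decision as a pure function of the instant's state makes each code path easy to test separately; the real writer would additionally have to perform the rollback and reload the timeline.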



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
