sivabalan narayanan created HUDI-5431:
-----------------------------------------
Summary: Fix rolling back of partially failed writes for all code
paths in MDT write flow
Key: HUDI-5431
URL: https://issues.apache.org/jira/browse/HUDI-5431
Project: Apache Hudi
Issue Type: Bug
Components: metadata
Reporter: sivabalan narayanan
In SparkHoodieBackedTableMetadataWriter
{code:java}
if (!metadataMetaClient.getActiveTimeline().containsInstant(instantTime)) {
// if this is a new commit being applied to metadata for the first time
writeClient.startCommitWithTime(instantTime);
} else {
Option<HoodieInstant> alreadyCompletedInstant =
metadataMetaClient.getActiveTimeline().filterCompletedInstants().filter(entry
-> entry.getTimestamp().equals(instantTime)).lastInstant();
if (alreadyCompletedInstant.isPresent()) {
// this code path refers to a re-attempted commit that got committed to
metadata table, but failed in datatable.
// for eg, lets say compaction c1 on 1st attempt succeeded in metadata
table and failed before committing to datatable.
// when retried again, data table will first rollback pending compaction.
these will be applied to metadata table, but all changes
// are upserts to metadata table and so only a new delta commit will be
created.
// once rollback is complete, compaction will be retried again, which will
eventually hit this code block where the respective commit is
// already part of completed commit. So, we have to manually remove the
completed instant and proceed.
// and it is for the same reason we enabled
withAllowMultiWriteOnSameInstant for metadata table.
HoodieActiveTimeline.deleteInstantFile(metadataMetaClient.getFs(),
metadataMetaClient.getMetaPath(), alreadyCompletedInstant.get());
metadataMetaClient.reloadActiveTimeline();
}
// If the alreadyCompletedInstant is empty, that means there is a requested
or inflight
// instant with the same instant time. This happens for data table clean
action which
// reuses the same instant time without rollback first. It is a no-op here
as the
// clean plan is the same, so we don't need to delete the requested and
inflight instant
// files in the active timeline.
} {code}
we missed to rollback partially failed commit in else block.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)