vinothchandar commented on a change in pull request #3210:
URL: https://github.com/apache/hudi/pull/3210#discussion_r687032405
##########
File path: hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieBackedMetadata.java
##########
@@ -480,6 +482,131 @@ public void testRollbackUnsyncedCommit(HoodieTableType tableType) throws Excepti
client.syncTableMetadata();
validateMetadata(client);
}
+
+ // If an unsynced commit is automatically rolled back during next commit, the rollback commit gets a timestamp
+ // greater than the new commit which is started. Ensure that in this case the rollback is not processed
+ // as the earlier failed commit would not have been committed.
+ //
+ // Dataset: C1 C2 C3.inflight[failed] C4 R5[rolls back C3]
+ // Metadata: C1.delta C2.delta
+ //
+ // When R5 completes, C3.xxx will be deleted. When C4 completes, C4 and R5 will be committed to Metadata Table in
Review comment:
On this one, I continue to disagree :). We can easily do a logger.warn
or logger.error and collect the information in an automated fashion to fix the
issues. In the current model, heavy user intervention is needed to fix the
metadata table and then get the pipeline back online. This is not very
user-friendly for folks in OSS.
I am thinking of adding an internal config to make this behavior
configurable, so you can have it fail hard at Uber and we can log and continue
in OSS. It'll actually help us see a wider variety of issues and harden the
implementation. wdyt?
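A minimal sketch of the kind of switch proposed above, assuming a hypothetical boolean flag (`failOnError`) and a hypothetical class `MetadataSyncErrorPolicy`; neither is an existing Hudi config or API. With the flag on, an inconsistency aborts the sync; with it off, the problem is logged and recorded so it can be collected later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical illustration only: these names are not part of Hudi.
class MetadataSyncErrorPolicy {
  private static final Logger LOG =
      Logger.getLogger(MetadataSyncErrorPolicy.class.getName());

  // Assumed internal config: true = fail hard, false = warn and continue.
  private final boolean failOnError;
  // Messages kept so they can be gathered automatically after the run.
  private final List<String> reported = new ArrayList<>();

  MetadataSyncErrorPolicy(boolean failOnError) {
    this.failOnError = failOnError;
  }

  // Called when the sync sees something unexpected, e.g. a rollback of an
  // instant that was never committed to the metadata table.
  void onInconsistency(String instantTime, String detail) {
    String msg = "Metadata sync inconsistency at instant " + instantTime + ": " + detail;
    if (failOnError) {
      throw new IllegalStateException(msg);
    }
    LOG.warning(msg);
    reported.add(msg);
  }

  // Read-only view of everything logged in lenient mode.
  List<String> reported() {
    return Collections.unmodifiableList(reported);
  }
}
```

In lenient mode the pipeline stays online and the recorded messages can be inspected afterwards; in strict mode the first inconsistency stops the sync with an exception.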