suryaprasanna commented on code in PR #9259:
URL: https://github.com/apache/hudi/pull/9259#discussion_r1271257906


##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieClientOnCopyOnWriteStorage.java:
##########
@@ -2736,8 +2738,12 @@ public void testClusteringCommitInPresenceOfInflightCommit() throws Exception {
     assertThrows(HoodieClusteringException.class, () -> clusteringWriteClient.cluster(clusteringCommitTime, true));
 
     // Do a rollback on the replacecommit that is failed
-    clusteringWriteClient.rollback(clusteringCommitTime);
+    // clusteringWriteClient.rollback(clusteringCommitTime);

Review Comment:
   Actually, rollback of these clustering commits is handled separately, so this 
will leave the inflight in the timeline. We use SparkAllowUpdateStrategy, and 
only in those cases are you allowed to use 
IngestionPrimaryWriterBasedConflictResolutionStrategy. We are using a couple of 
approaches to clean up these inflights: one is to explicitly assign a rollback 
for the failed commit; the other is to include replacecommits as part of 
rollbackFailedWrites, so that ingestion takes care of clearing them.
   I think we need to make the immutable nature of clustering commits a table 
property, i.e. store it in hoodie.properties. That way ingestion knows whether 
a clustering commit can be rolled back or not, and accordingly it can use 
either the SparkRejectUpdateStrategy or the SparkAllowUpdateStrategy 
implementation, and cleanup can be done separately.
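   A minimal sketch of that selection, to make the proposal concrete. The 
property key and the helper class are hypothetical (no such key exists in 
hoodie.properties today), the strategies are returned as simple class names 
only, and treating a missing property as "immutable" is an assumption on my 
part. The mapping assumes that updates are allowed only when the clustering 
commit can be rolled back.

```java
import java.util.Properties;

public class ClusteringUpdateStrategyChooser {

    // Hypothetical table property key; not an existing Hudi config.
    public static final String CLUSTERING_IMMUTABLE_KEY =
        "hoodie.table.clustering.commits.immutable";

    // Returns the simple name of the update strategy ingestion would use.
    // Assumption: a missing property is treated as "immutable".
    public static String chooseUpdateStrategy(Properties tableProps) {
        boolean immutable = Boolean.parseBoolean(
            tableProps.getProperty(CLUSTERING_IMMUTABLE_KEY, "true"));
        // Immutable clustering commits cannot be rolled back by ingestion,
        // so updates to file groups under clustering are rejected.
        return immutable ? "SparkRejectUpdateStrategy" : "SparkAllowUpdateStrategy";
    }

    public static void main(String[] args) {
        // Stand-in for the contents of hoodie.properties.
        Properties props = new Properties();
        props.setProperty(CLUSTERING_IMMUTABLE_KEY, "false");
        System.out.println(chooseUpdateStrategy(props));
    }
}
```

   Cleanup of the leftover inflights is deliberately not part of this sketch, 
since it would be done separately.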


