suryaprasanna commented on code in PR #9259:
URL: https://github.com/apache/hudi/pull/9259#discussion_r1271257906
##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/client/functional/TestHoodieClientOnCopyOnWriteStorage.java:
##########
@@ -2736,8 +2738,12 @@ public void testClusteringCommitInPresenceOfInflightCommit() throws Exception {
assertThrows(HoodieClusteringException.class, () -> clusteringWriteClient.cluster(clusteringCommitTime, true));
// Do a rollback on the replacecommit that is failed
- clusteringWriteClient.rollback(clusteringCommitTime);
+ // clusteringWriteClient.rollback(clusteringCommitTime);
Review Comment:
Actually, rollback of these clustering commits is handled separately, so the
failed clustering leaves its inflight instant in the timeline. We use
SparkAllowUpdateStrategy, and only in that case are you allowed to use
IngestionPrimaryWriterBasedConflictResolutionStrategy. We use a couple of
approaches to clean up these inflights: one is to explicitly schedule a
rollback for the failed commit, and the other is to include replacecommits in
rollbackFailedWrites, so that ingestion takes care of clearing them.
I think we need to make the immutable nature of clustering commits a table
property, i.e. store it in hoodie.properties. That way ingestion knows whether
a clustering commit can be rolled back or not, and it can accordingly use
either the SparkRejectUpdateStrategy or the SparkAllowUpdateStrategy
implementation, and cleanup can be done separately.
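To make the proposal concrete, here is a rough sketch of how ingestion could
pick the update strategy from a table property. It is only an illustration:
the property key and the ClusteringStrategySelector class are hypothetical and
do not exist in Hudi today. The mapping (immutable clustering commits select
SparkRejectUpdateStrategy, otherwise SparkAllowUpdateStrategy) and the default
used when the property is absent are my assumptions. A plain
java.util.Properties object stands in for hoodie.properties.

```java
import java.util.Properties;

// Sketch only: nothing in this class is an existing Hudi API.
class ClusteringStrategySelector {
    // Hypothetical key; Hudi does not define this property today.
    static final String IMMUTABLE_CLUSTERING_KEY = "hoodie.table.clustering.commits.immutable";

    // Returns the simple name of the update strategy that ingestion would use.
    // Assumption: immutable clustering commits cannot be rolled back, so
    // ingestion rejects updates to file groups that are being clustered.
    static String chooseUpdateStrategy(Properties tableProps) {
        // Assumption: a missing property is treated as "immutable".
        boolean immutable = Boolean.parseBoolean(
            tableProps.getProperty(IMMUTABLE_CLUSTERING_KEY, "true"));
        return immutable ? "SparkRejectUpdateStrategy" : "SparkAllowUpdateStrategy";
    }

    public static void main(String[] args) {
        Properties tableProps = new Properties();
        tableProps.setProperty(IMMUTABLE_CLUSTERING_KEY, "false");
        System.out.println(chooseUpdateStrategy(tableProps));
    }
}
```

With something like this in place, cleanup of leftover replacecommit inflights
would only apply to tables where the property marks clustering commits as
rollback-able.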