yihua commented on code in PR #7035:
URL: https://github.com/apache/hudi/pull/7035#discussion_r1014445796


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/config/HoodieWriteConfig.java:
##########
@@ -1335,10 +1335,14 @@ public boolean isClusteringEnabled() {
     return inlineClusteringEnabled() || isAsyncClusteringEnabled();
   }
 
-  public boolean isRollbackPendingClustering() {
+  public boolean isRollbackPendingClusteringOnConflict() {
     return 
getBoolean(HoodieClusteringConfig.ROLLBACK_PENDING_CLUSTERING_ON_CONFLICT);
   }
 
+  public boolean isRollbackPendingClustering() {
+    return getBoolean(HoodieClusteringConfig.ROLLBACK_PENDING_CLUSTERING);
+  }
+

Review Comment:
   Could you avoid changing the logic of existing APIs and rather adding a new 
API named `isRollbackPendingClusteringWhenDisablingClustering`?



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieWriteClient.java:
##########
@@ -588,6 +588,19 @@ protected void runTableServicesInline(HoodieTable table, 
HoodieCommitMetadata me
         
metadata.addMetadata(HoodieClusteringConfig.SCHEDULE_INLINE_CLUSTERING.key(), 
"true");
         inlineScheduleClustering(extraMetadata);
       }
+
+      // if clustering is disabled, but we might need to rollback any inflight 
clustering when clustering was enabled previously.
+      if (!config.inlineClusteringEnabled() && 
!config.isAsyncClusteringEnabled() && !config.scheduleInlineClustering()
+          && config.isRollbackPendingClustering()) {
+        // rollback any pending clustering
+        
table.getActiveTimeline().filterPendingReplaceTimeline().getInstants().forEach(instant
 -> {
+          Option<Pair<HoodieInstant, HoodieClusteringPlan>> instantPlan = 
ClusteringUtils.getClusteringPlan(table.getMetaClient(), instant);
+          if (instantPlan.isPresent()) {
+            LOG.info("Rolling back pending clustering at instant " + 
instantPlan.get().getLeft());
+            rollback(instant.getTimestamp());

Review Comment:
   This API ignores the pending rollbacks for clustering.  For the following 
case, does it leave an inflight rollback in the timeline indefinitely?
   
   timeline: commit1, commit2, replacecommit3.requested (clustering), 
rollback1.inflight (rolling back replacecommit3, job failed in the middle)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to