[GitHub] [hudi] ksmou commented on a diff in pull request #8944: [HUDI-6359]Spark offline compaction/clustering will never rollback when both requested and inflight states exist

via GitHub Fri, 16 Jun 2023 02:27:56 -0700


ksmou commented on code in PR #8944:
URL: https://github.com/apache/hudi/pull/8944#discussion_r1232005192



##########
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieClusteringJob.java:
##########
@@ -209,8 +209,7 @@ private int doCluster(JavaSparkContext jsc) throws Exception {
         // Instant time is not specified
         // Find the earliest scheduled clustering instant for execution
         Option<HoodieInstant> firstClusteringInstant =
-            metaClient.getActiveTimeline().firstInstant(
-                HoodieTimeline.REPLACE_COMMIT_ACTION, HoodieInstant.State.REQUESTED);
+            metaClient.getActiveTimeline().filterPendingReplaceTimeline().firstInstant();

Review Comment:
   We already support aligning the clustering job with the write client's 
behavior. However, when the write traffic is too heavy, we need to start an 
independent job to schedule the compaction or clustering plan.
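
For context on the diff above, here is a minimal sketch of how the two lookups can differ. It does not use Hudi's classes: `PendingInstantDemo`, `Instant`, `State`, `firstRequested`, and `firstPending` are hypothetical stand-ins, and the sketch assumes "pending" means any replace instant that is not yet completed (requested or inflight).

```java
import java.util.List;
import java.util.Optional;

public class PendingInstantDemo {
    // Hypothetical stand-in for an instant's lifecycle state.
    public enum State { REQUESTED, INFLIGHT, COMPLETED }

    // Hypothetical stand-in for a timeline instant; not Hudi's HoodieInstant.
    public static final class Instant {
        public final String timestamp;
        public final String action;
        public final State state;

        public Instant(String timestamp, String action, State state) {
            this.timestamp = timestamp;
            this.action = action;
            this.state = state;
        }
    }

    // Old lookup (illustrative): earliest replace instant that is still REQUESTED.
    public static Optional<Instant> firstRequested(List<Instant> timeline) {
        return timeline.stream()
            .filter(i -> i.action.equals("replacecommit") && i.state == State.REQUESTED)
            .findFirst();
    }

    // New lookup (illustrative): earliest replace instant that is not completed,
    // so an INFLIGHT instant left by an earlier failed run is also picked up.
    public static Optional<Instant> firstPending(List<Instant> timeline) {
        return timeline.stream()
            .filter(i -> i.action.equals("replacecommit") && i.state != State.COMPLETED)
            .findFirst();
    }

    public static void main(String[] args) {
        // Timeline ordered by timestamp: an older inflight clustering, then a newer requested one.
        List<Instant> timeline = List.of(
            new Instant("001", "replacecommit", State.INFLIGHT),
            new Instant("002", "replacecommit", State.REQUESTED));

        // prints "002": the inflight instant 001 is skipped
        System.out.println(firstRequested(timeline).map(i -> i.timestamp).orElse("none"));
        // prints "001": the inflight instant is selected first
        System.out.println(firstPending(timeline).map(i -> i.timestamp).orElse("none"));
    }
}
```

In this toy timeline, the requested-only lookup never returns the stuck inflight instant, while the pending lookup returns it first, so it can be rolled back or re-executed.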



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
