bvaradar commented on code in PR #7891:
URL: https://github.com/apache/hudi/pull/7891#discussion_r1121217637


##########
hudi-common/src/main/java/org/apache/hudi/common/util/ClusteringUtils.java:
##########
@@ -250,15 +253,24 @@ public static Option<HoodieInstant> getOldestInstantToRetainForClustering(
                         ? cleanInstant
                         : HoodieTimeline.getCleanRequestedInstant(cleanInstant.getTimestamp()))
                 .getEarliestInstantToRetain().getTimestamp();
-        return StringUtils.isNullOrEmpty(earliestCommitToRetain)
-            ? Option.empty()
-            : replaceTimeline.filter(instant ->
-                HoodieTimeline.compareTimestamps(instant.getTimestamp(),
-                    HoodieTimeline.GREATER_THAN_OR_EQUALS,
-                    earliestCommitToRetain))
-            .firstInstant();
+        if (!StringUtils.isNullOrEmpty(earliestCommitToRetain)) {
+          oldestInstantToRetain = replaceTimeline.filterCompletedInstants()
+              .filter(instant -> HoodieTimeline.compareTimestamps(instant.getTimestamp(), HoodieTimeline.GREATER_THAN_OR_EQUALS, earliestCommitToRetain))
+              .firstInstant();
+        }
+      }
+      Option<HoodieInstant> pendingInstantOpt = replaceTimeline.filterInflights().firstInstant();
+      if (pendingInstantOpt.isPresent()) {
+        // Get the previous commit before the first inflight clustering instant.
+        Option<HoodieInstant> beforePendingInstant = activeTimeline.filterCompletedInstants()

Review Comment:
   Shouldn't this be activeTimeline.getCommitsTimeline().filterCompletedInstants(), so that it only includes DeltaCommit, Commit, and ReplaceCommit instants?
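
   A minimal, self-contained sketch of why the extra filter matters. It does not use Hudi classes: `SketchInstant`, `COMMIT_ACTIONS`, and `lastCompletedBefore` are hypothetical stand-ins, and the "latest completed instant before the pending one" lookup is an assumption about what the truncated line goes on to do. With a completed clean instant sitting just before the pending replace commit, the unfiltered lookup returns the clean instant rather than a commit.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class TimelineFilterSketch {
  // Hypothetical stand-in for a timeline instant: timestamp, action, completion state.
  static final class SketchInstant {
    final String timestamp;
    final String action;
    final boolean completed;

    SketchInstant(String timestamp, String action, boolean completed) {
      this.timestamp = timestamp;
      this.action = action;
      this.completed = completed;
    }
  }

  // The three action types named in the review comment (string values are illustrative).
  static final Set<String> COMMIT_ACTIONS =
      new HashSet<>(Arrays.asList("commit", "deltacommit", "replacecommit"));

  // Returns the latest completed instant strictly before pendingTs.
  // When commitsOnly is true, non-commit actions (clean, rollback, ...) are skipped first.
  static Optional<SketchInstant> lastCompletedBefore(
      List<SketchInstant> timeline, String pendingTs, boolean commitsOnly) {
    return timeline.stream()
        .filter(i -> i.completed)
        .filter(i -> !commitsOnly || COMMIT_ACTIONS.contains(i.action))
        .filter(i -> i.timestamp.compareTo(pendingTs) < 0)
        .max(Comparator.comparing((SketchInstant i) -> i.timestamp));
  }

  public static void main(String[] args) {
    List<SketchInstant> timeline = Arrays.asList(
        new SketchInstant("00000001", "commit", true),
        new SketchInstant("00000002", "clean", true),
        new SketchInstant("00000003", "replacecommit", false)); // pending clustering

    // Without the commit filter, the clean instant is picked up.
    System.out.println(lastCompletedBefore(timeline, "00000003", false).get().timestamp);
    // With the commit filter, the lookup lands on the previous commit.
    System.out.println(lastCompletedBefore(timeline, "00000003", true).get().timestamp);
  }
}
```

   In this toy timeline the first call prints 00000002 (the clean instant) and the second prints 00000001 (the commit), which is the difference the suggestion is pointing at.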



##########
hudi-client/hudi-spark-client/src/test/java/org/apache/hudi/io/TestHoodieTimelineArchiver.java:
##########
@@ -1400,6 +1400,28 @@ public void testArchivalAndCompactionInMetadataTable() throws Exception {
     }
   }
 
+  @ParameterizedTest
+  @ValueSource(booleans = {true, false})
+  public void testPendingClusteringAfterArchiveCommit(boolean enableMetadata) throws Exception {
+    HoodieWriteConfig writeConfig = initTestTableAndGetWriteConfig(enableMetadata, 2, 5, 2);
+    // timeline:0000000(completed)->00000001(completed)->00000002(replace&inflight)->00000003(completed)->...->00000007(completed)
+    HoodieTestDataGenerator.createPendingReplaceFile(basePath, "00000002", wrapperFs.getConf());
+    for (int i = 1; i < 8; i++) {
+      if (i != 2) {
+        testTable.doWriteOperation("0000000" + i, WriteOperationType.UPSERT, Arrays.asList("p1", "p2"), Arrays.asList("p1", "p2"), 2);

Review Comment:
   Minor comment: should the WriteOperationType be one of cluster, insert_overwrite_table, insert_overwrite, or delete_partition to make sense here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
