zhuanshenbsj1 commented on code in PR #7405:
URL: https://github.com/apache/hudi/pull/7405#discussion_r1058708650


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanner.java:
##########
@@ -188,8 +188,12 @@ private List<String> getPartitionPathsForIncrementalCleaning(HoodieCleanMetadata
         + "since last cleaned at " + cleanMetadata.getEarliestCommitToRetain()
         + ". New Instant to retain : " + newInstantToRetain);
     return hoodieTable.getCompletedCommitsTimeline().getInstantsAsStream().filter(
-        instant -> HoodieTimeline.compareTimestamps(instant.getTimestamp(), HoodieTimeline.GREATER_THAN_OR_EQUALS,
-            cleanMetadata.getEarliestCommitToRetain()) && HoodieTimeline.compareTimestamps(instant.getTimestamp(),
+        instant -> (HoodieTimeline.compareTimestamps(instant.getTimestamp(), HoodieTimeline.GREATER_THAN_OR_EQUALS,
+            cleanMetadata.getEarliestCommitToRetain())
+              || (instant.getMarkerFileModificationTimestamp().isPresent()

Review Comment:
   > If an out-of-order replace commit finished before the clean start and the
   > instant time of the replace commit is before the earliest commit to retain,
   > it won't be cleaned and will be left in the timeline. Archiver will then
   > archive it since its last modified time is earlier than the last clean in
   > the timeline. What do you think?
   
   You are right, it still won't clean the clustering instant in this scenario.
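
To make the scenario concrete, here is a minimal, self-contained sketch of the instant-time filter only. It does not use Hudi classes: `CompletedInstant`, `instantsConsideredForCleaning`, and the timestamps are hypothetical, and plain string comparison stands in for `HoodieTimeline.compareTimestamps(..., GREATER_THAN_OR_EQUALS, ...)` on the assumption that instant times are fixed-width and compare lexicographically. The marker-file condition added by this PR is truncated in the hunk above, so it is not modeled here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class IncrementalCleanFilterSketch {

    // Hypothetical stand-in for a completed timeline instant (not a Hudi class).
    static final class CompletedInstant {
        final String timestamp; // instant (start) time, e.g. "20221228090000"
        final String action;    // e.g. "commit" or "replacecommit"

        CompletedInstant(String timestamp, String action) {
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    // Keeps only instants whose instant time is >= the earliest commit retained
    // by the previous clean. Assumption: fixed-width timestamps, so
    // String.compareTo gives the same ordering as the timeline comparison.
    static List<CompletedInstant> instantsConsideredForCleaning(
            List<CompletedInstant> completed, String earliestCommitToRetain) {
        return completed.stream()
            .filter(i -> i.timestamp.compareTo(earliestCommitToRetain) >= 0)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String earliestCommitToRetain = "20221228100000";
        List<CompletedInstant> completed = Arrays.asList(
            // Out-of-order replace commit: started before the earliest commit
            // to retain, but completed only after the previous clean.
            new CompletedInstant("20221228090000", "replacecommit"),
            new CompletedInstant("20221228110000", "commit"));

        for (CompletedInstant i : instantsConsideredForCleaning(completed, earliestCommitToRetain)) {
            System.out.println(i.timestamp + " " + i.action);
        }
    }
}
```

In this sketch the replace commit never passes the filter because only its instant time is compared, not its completion time, so the partitions it touched would not be picked up by incremental cleaning.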


