danny0405 commented on code in PR #11440:
URL: https://github.com/apache/hudi/pull/11440#discussion_r1680216558


##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/clean/CleanPlanActionExecutor.java:
##########
@@ -149,17 +150,24 @@ HoodieCleanerPlan requestClean(HoodieEngineContext context) {
           .map(x -> new HoodieActionInstant(x.getTimestamp(), x.getAction(), x.getState().name())).orElse(null),
           planner.getLastCompletedCommitTimestamp(),
           config.getCleanerPolicy().name(), Collections.emptyMap(),
-          CleanPlanner.LATEST_CLEAN_PLAN_VERSION, cleanOps, partitionsToDelete, prepareExtraMetadata(planner.getSavepointedTimestamps()));
+          CleanPlanner.LATEST_CLEAN_PLAN_VERSION, cleanOps, partitionsToDelete, prepareExtraMetadata(planner.getSavepointedTimestamps(), planner.getEarliestSavepoint()));
     } catch (IOException e) {
       throw new HoodieIOException("Failed to schedule clean operation", e);
     }
   }
 
-  private Map<String, String> prepareExtraMetadata(List<String> savepointedTimestamps) {
-    if (savepointedTimestamps.isEmpty()) {
+  private Map<String, String> prepareExtraMetadata(List<String> savepointedTimestamps, Option<String> earliestSavepoint) {
+    if (savepointedTimestamps.isEmpty() && !earliestSavepoint.isPresent()) {
       return Collections.emptyMap();
     } else {
-      return Collections.singletonMap(SAVEPOINTED_TIMESTAMPS, savepointedTimestamps.stream().collect(Collectors.joining(",")));
+      Map<String, String> extraMetadata = new HashMap<>();
+      if (!savepointedTimestamps.isEmpty()) {
+        extraMetadata.put(SAVEPOINTED_TIMESTAMPS, savepointedTimestamps.stream().collect(Collectors.joining(",")));

Review Comment:
   The clean metadata already records the savepointed timestamps (call them
`s1`) and `earliestCommitToRetain`. So in the archiver, let's just check all
the instants (call them `s2`) within the range
[min(s1), earliestCommitToRetain]: if `s2` contains any replace_commits, guard
the archival with min(s1); otherwise the archiver is good to go.
   
   With that, there is actually no need to put any extra metadata into the
clean metadata.
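
A minimal sketch of the check proposed above, to make the range logic concrete. It is not Hudi code: `ArchivalGuardSketch`, `Instant`, `archivalGuard`, and the `"replacecommit"` action string are hypothetical stand-ins, and it assumes instant timestamps are fixed-width strings that order correctly under plain string comparison.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for illustration; none of these names are Hudi APIs.
final class ArchivalGuardSketch {

  /** Minimal instant: a timestamp plus an action such as "commit" or "replacecommit". */
  static final class Instant {
    final String timestamp;
    final String action;

    Instant(String timestamp, String action) {
      this.timestamp = timestamp;
      this.action = action;
    }
  }

  // Assumed action name used to recognize replace commits.
  static final String REPLACE_COMMIT_ACTION = "replacecommit";

  /**
   * Returns min(s1) when archival must be guarded by the earliest savepoint,
   * or empty when the archiver is good to go.
   */
  static Optional<String> archivalGuard(List<String> savepointedTimestamps,
                                        String earliestCommitToRetain,
                                        List<Instant> timeline) {
    if (savepointedTimestamps.isEmpty()) {
      return Optional.empty(); // no savepoints, nothing to guard
    }
    // Assumes timestamps compare correctly as strings.
    String minSavepoint = Collections.min(savepointedTimestamps);
    // s2: instants within [min(s1), earliestCommitToRetain], both ends inclusive.
    boolean hasReplaceCommit = timeline.stream()
        .filter(i -> i.timestamp.compareTo(minSavepoint) >= 0
            && i.timestamp.compareTo(earliestCommitToRetain) <= 0)
        .anyMatch(i -> REPLACE_COMMIT_ACTION.equals(i.action));
    return hasReplaceCommit ? Optional.of(minSavepoint) : Optional.empty();
  }

  public static void main(String[] args) {
    List<Instant> timeline = Arrays.asList(
        new Instant("001", "commit"),
        new Instant("003", "replacecommit"),
        new Instant("005", "commit"));
    // Replace commit "003" falls inside ["002", "005"], so archival is guarded at "002".
    System.out.println(archivalGuard(Arrays.asList("002", "004"), "005", timeline));
    // No replace commit inside ["004", "005"], so no guard is needed.
    System.out.println(archivalGuard(Arrays.asList("004"), "005", timeline));
  }
}
```

Returning an empty `Optional` for the "good to go" case keeps the caller's decision explicit: the archiver would only cap its archival boundary when a value is present.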



