danny0405 commented on code in PR #13606:
URL: https://github.com/apache/hudi/pull/13606#discussion_r2228476012
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/timeline/versioning/v2/TimelineArchiverV2.java:
##########
@@ -256,6 +260,25 @@ private List<HoodieInstant> getCommitInstantsToArchive() throws IOException {
earliestInstantToRetainCandidates.add(qualifiedEarliestInstant);
}
+ // 6. If archival should consider `earliest retain instant` in the clean plan,
+ // we should add the earliest retain instant from the clean plan to the candidates.
+ if (config.shouldArchiveKeepCleanPlanRetainInstant()) {
Review Comment:
`retain in clean plan` is just a marker for the left boundary of incremental
cleaning. It is not a general constraint on archiving; this variable was
mainly introduced to resolve clustering data duplication issues, as the doc says:
"the clustering instant won't be archived before cleaned, and the earliest
inflight clustering instant has a previous commit"
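
To make the concern concrete, here is a minimal sketch of how adding the clean plan's retain instant to the candidate set would move the archival boundary. It is not the Hudi implementation: `ArchiveBoundarySketch`, `earliestInstantToRetain`, and the plain string timestamps are hypothetical, and it assumes the archiver keeps everything at or after the minimum of its candidates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not Hudi code: instants are modeled as sortable
// timestamp strings, and the archival boundary is assumed to be the
// smallest candidate (everything before it may be archived).
final class ArchiveBoundarySketch {

  static Optional<String> earliestInstantToRetain(
      List<String> baseCandidates,
      Optional<String> cleanPlanRetainInstant,
      boolean keepCleanPlanRetainInstant) {
    List<String> candidates = new ArrayList<>(baseCandidates);
    // Mirrors the guarded block in the diff: the clean plan's retain
    // instant only becomes a candidate when the flag is enabled.
    if (keepCleanPlanRetainInstant) {
      cleanPlanRetainInstant.ifPresent(candidates::add);
    }
    // The earliest candidate wins, so an old clean-plan marker would
    // hold back archival of every instant after it.
    return candidates.stream().min(Comparator.naturalOrder());
  }
}
```

Under these assumptions, enabling the flag turns the incremental-cleaning marker into an archival constraint whenever it is older than the other candidates, which is the behavior questioned above.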