yihua commented on PR #9039: URL: https://github.com/apache/hudi/pull/9039#issuecomment-1604718015
> 2. getOldestInstantToRetainForCompaction also needs to check earliestInstantToRetain like getOldestInstantToRetainForClustering, to ensure that the files related to the compaction instant have been cleaned up before archiving.

This is not a bug, and no change is needed here. Compaction differs from clustering: compaction does not add or delete any file group, whereas clustering generates a replacecommit that replaces existing file groups with new ones, so the cleaner has to delete the old file groups based on the replacecommit information in the active timeline. Even if the compaction commit is archived, cleaning still behaves properly, because the old file slices that the compaction touched can still be identified and deleted.
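To make the distinction concrete, here is a toy model of the argument. It is only a sketch and not Hudi's cleaner: `FileSlice`, `FileGroup`, `slices_to_clean`, `replaced_groups_to_clean`, and the dict-shaped replacecommit are all hypothetical names chosen for illustration. It assumes that old file slices can be found by inspecting the file group itself, while replaced file groups are only discoverable through replacecommit metadata that is still in the active timeline.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class FileSlice:
    base_instant: str  # instant time at which this slice was created


@dataclass
class FileGroup:
    file_group_id: str
    slices: List[FileSlice]


def slices_to_clean(group: FileGroup, versions_to_keep: int = 1) -> List[FileSlice]:
    """Compaction case: old slices are found from the file group alone.

    No timeline lookup is needed, so archiving the compaction instant
    does not change the result.
    """
    ordered = sorted(group.slices, key=lambda s: s.base_instant)
    return ordered[: max(0, len(ordered) - versions_to_keep)]


def replaced_groups_to_clean(
    groups: List[FileGroup], active_replace_commits: List[Dict]
) -> List[str]:
    """Clustering case: replaced groups are known only from replacecommits.

    Only replacecommits still in the active timeline are visible here
    (an assumption of this toy model).
    """
    replaced_ids = set()
    for commit in active_replace_commits:
        replaced_ids.update(commit["replaced_file_group_ids"])
    return [g.file_group_id for g in groups if g.file_group_id in replaced_ids]


# Compaction at instant "005" adds a new slice to the same file group.
compacted = FileGroup("fg1", [FileSlice("001"), FileSlice("005")])

# Clustering at instant "007" replaces fg_a and fg_b with fg_c.
clustered = [
    FileGroup("fg_a", [FileSlice("001")]),
    FileGroup("fg_b", [FileSlice("002")]),
    FileGroup("fg_c", [FileSlice("007")]),
]
replace_commit = {"instant": "007", "replaced_file_group_ids": ["fg_a", "fg_b"]}
```

In this model, `slices_to_clean(compacted)` returns the slice from instant `001` whether or not instant `005` has been archived. By contrast, `replaced_groups_to_clean(clustered, [])`, which simulates an archived replacecommit, finds nothing to delete, and that is why clustering needs the extra check before archiving.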
