SteNicholas commented on PR #7405:
URL: https://github.com/apache/hudi/pull/7405#issuecomment-1365896759
@zhuanshenbsj1, +1. I have modified
`getOldestInstantToRetainForClustering` in the above patch as follows
to handle the case you mentioned:
```
/**
 * Checks whether the latest clustering instant has a subsequent cleaning action. Returns
 * the clustering instant if there is such cleaning action or empty.
 *
 * @param activeTimeline The active timeline
 * @return the oldest instant to retain for clustering
 */
public static Option<HoodieInstant> getOldestInstantToRetainForClustering(
    HoodieActiveTimeline activeTimeline) throws IOException {
  Option<HoodieInstant> cleanInstant =
      activeTimeline.getCleanerTimeline().filter(instant -> !instant.isCompleted()).firstInstant();
  if (cleanInstant.isPresent()) {
    // The first clustering instant of which timestamp is greater than or equal to
    // the earliest commit to retain of the clean metadata.
    HoodieCleanMetadata cleanMetadata =
        TimelineMetadataUtils.deserializeHoodieCleanMetadata(
            activeTimeline.getInstantDetails(cleanInstant.get()).get());
    return activeTimeline.getCompletedReplaceTimeline()
        .filter(instant ->
            HoodieTimeline.compareTimestamps(
                instant.getTimestamp(),
                HoodieTimeline.GREATER_THAN_OR_EQUALS,
                cleanMetadata.getEarliestCommitToRetain()))
        .firstInstant();
  }
  return Option.empty();
}
```
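To make the selection rule easier to check in isolation, here is a minimal sketch that does not use any Hudi classes. It assumes instants are fixed-width timestamp strings that compare lexicographically; the class `ClusteringRetentionSketch` and its method are hypothetical names for illustration only, not part of the patch or of Hudi's API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical, Hudi-free illustration of the rule used in the patch above.
public class ClusteringRetentionSketch {

  // Returns the first completed clustering timestamp that is greater than or
  // equal to the cleaner's earliest commit to retain, or empty if there is no
  // clean metadata or no such clustering instant.
  public static Optional<String> oldestClusteringInstantToRetain(
      List<String> completedClusteringTimestamps,
      Optional<String> earliestCommitToRetain) {
    if (!earliestCommitToRetain.isPresent()) {
      return Optional.empty();
    }
    String earliest = earliestCommitToRetain.get();
    return completedClusteringTimestamps.stream()
        .sorted() // assumption: fixed-width timestamps sort lexicographically
        .filter(ts -> ts.compareTo(earliest) >= 0)
        .findFirst();
  }

  public static void main(String[] args) {
    List<String> clustering =
        Arrays.asList("20221220100000", "20221222100000", "20221226100000");
    // Earliest commit to retain falls between the first and second clustering instants.
    System.out.println(
        oldestClusteringInstantToRetain(clustering, Optional.of("20221221000000")));
    // No clean metadata available: nothing to retain for clustering.
    System.out.println(
        oldestClusteringInstantToRetain(clustering, Optional.empty()));
  }
}
```

With the sample timestamps, the first call selects `20221222100000` and the second returns an empty result, which mirrors the `Option.empty()` fallback in the patch.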
PTAL.