amogh-jahagirdar commented on a change in pull request #4006:
URL: https://github.com/apache/iceberg/pull/4006#discussion_r796078739
##########
File path: core/src/main/java/org/apache/iceberg/RemoveSnapshots.java
##########
@@ -155,19 +154,88 @@ private TableMetadata internalApply() {
this.base = ops.refresh();
Set<Long> idsToRetain = Sets.newHashSet();
- List<Long> ancestorIds = SnapshotUtil.ancestorIds(base.currentSnapshot(),
base::snapshot);
- if (minNumSnapshots >= ancestorIds.size()) {
- idsToRetain.addAll(ancestorIds);
- } else {
- idsToRetain.addAll(ancestorIds.subList(0, minNumSnapshots));
+ Map<String, SnapshotRef> references = base.refs();
+ long currentTime = System.currentTimeMillis();
+ Map<String, SnapshotRef> referencesToRetain = base.refs()
+ .entrySet()
+ .stream()
+ .filter(refEntry -> refEntry.getKey().equals(SnapshotRef.MAIN_BRANCH)
||
+ refEntry.getValue().maxRefAgeMs() == null ||
+ currentTime - refEntry.getValue().timestampMillis() <
refEntry.getValue().maxRefAgeMs())
+ .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
+
+ //All snapshots should be retained
+ if (globalMinSnapshots >= base.snapshots().size()) {
+ idsToRetain.addAll(base.snapshots().stream().map(snapshot ->
snapshot.snapshotId()).collect(Collectors.toList()));
+ }
+ else {
+ List<SnapshotRef> refs = Lists.newArrayList(references.values());
+ for (SnapshotRef ref : refs) {
+ if (ref.type().equals(SnapshotRefType.BRANCH)) {
+ Snapshot startingSnapshot = base.snapshot(ref.snapshotId());
Review comment:
Here we loop over all refs, including ones that may already have expired,
because the loop reads from `references` rather than `referencesToRetain`.
When evaluating retention we then apply the branch policy even though the ref
itself is expired. If we want the default to be that we fall back to the global
policy when the reference itself is expired, we should update this logic.
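
To illustrate, here is a minimal, self-contained sketch of iterating only over
unexpired refs. `Ref`, `RefRetentionSketch`, `unexpiredRefs`, and the
`"main"` constant are hypothetical stand-ins for illustration and are not the
Iceberg classes; the predicate mirrors the filter in the diff above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RefRetentionSketch {
  // Assumed name of the main branch; stands in for SnapshotRef.MAIN_BRANCH.
  static final String MAIN_BRANCH = "main";

  // Hypothetical stand-in for SnapshotRef with only the fields used here.
  static final class Ref {
    final long snapshotId;
    final boolean isBranch;
    final long timestampMillis;
    final Long maxRefAgeMs; // null means the ref never expires

    Ref(long snapshotId, boolean isBranch, long timestampMillis, Long maxRefAgeMs) {
      this.snapshotId = snapshotId;
      this.isBranch = isBranch;
      this.timestampMillis = timestampMillis;
      this.maxRefAgeMs = maxRefAgeMs;
    }
  }

  // Keeps main, refs without a max age, and refs younger than their max age.
  static Map<String, Ref> unexpiredRefs(Map<String, Ref> refs, long nowMillis) {
    Map<String, Ref> retained = new LinkedHashMap<>();
    for (Map.Entry<String, Ref> entry : refs.entrySet()) {
      Ref ref = entry.getValue();
      boolean keep = entry.getKey().equals(MAIN_BRANCH)
          || ref.maxRefAgeMs == null
          || nowMillis - ref.timestampMillis < ref.maxRefAgeMs;
      if (keep) {
        retained.put(entry.getKey(), ref);
      }
    }
    return retained;
  }
}
```

The branch-policy loop would then walk `unexpiredRefs(...).values()` instead of
every ref, so snapshots reachable only from an expired ref would be left to the
global policy, if that is the default we want.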
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]