amogh-jahagirdar commented on code in PR #4578:
URL: https://github.com/apache/iceberg/pull/4578#discussion_r863901394
##########
core/src/main/java/org/apache/iceberg/RemoveSnapshots.java:
##########
@@ -75,32 +79,53 @@ public void accept(String file) {
private final TableOperations ops;
private final Set<Long> idsToRemove = Sets.newHashSet();
+ private final long now;
+ private final long defaultMaxRefAgeMs;
private boolean cleanExpiredFiles = true;
private TableMetadata base;
private long expireOlderThan;
- private int minNumSnapshots;
+ private Integer minNumSnapshots;
private Consumer<String> deleteFunc = defaultDelete;
private ExecutorService deleteExecutorService =
DEFAULT_DELETE_EXECUTOR_SERVICE;
private ExecutorService planExecutorService = ThreadPools.getWorkerPool();
RemoveSnapshots(TableOperations ops) {
this.ops = ops;
this.base = ops.current();
-
- long maxSnapshotAgeMs = PropertyUtil.propertyAsLong(
- base.properties(),
- MAX_SNAPSHOT_AGE_MS,
- MAX_SNAPSHOT_AGE_MS_DEFAULT);
- this.expireOlderThan = System.currentTimeMillis() - maxSnapshotAgeMs;
-
- this.minNumSnapshots = PropertyUtil.propertyAsInt(
- base.properties(),
- MIN_SNAPSHOTS_TO_KEEP,
- MIN_SNAPSHOTS_TO_KEEP_DEFAULT);
-
ValidationException.check(
PropertyUtil.propertyAsBoolean(base.properties(), GC_ENABLED,
GC_ENABLED_DEFAULT),
"Cannot expire snapshots: GC is disabled (deleting files may corrupt
other tables)");
+
+ SnapshotRef mainBranch = base.ref(SnapshotRef.MAIN_BRANCH);
+ Long maxSnapshotAgeMs = null;
+
+ if (mainBranch != null) {
+ maxSnapshotAgeMs = mainBranch.maxSnapshotAgeMs();
+ minNumSnapshots = mainBranch.minSnapshotsToKeep();
Review Comment:
Yeah I had a comment for fixing this below, this current logic for the max
snapshot age, and min snapshots to keep isn't right. Here we can set the
default retention properties from the table level. Then when we go to compute
which branch snapshots to retain we would only fall back to these if the branch
properties aren't set.
When determining which retention property for min snapshots/max snapshot age
to use we don't need to distinguish between main and other branches. In either
case, we would fall back to using the table property if it's missing for the
branch. As you mentioned the only differentiation with main is that the ref
shouldn't be aged out.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]