amogh-jahagirdar commented on code in PR #4578:
URL: https://github.com/apache/iceberg/pull/4578#discussion_r851877024


##########
core/src/main/java/org/apache/iceberg/RemoveSnapshots.java:
##########
@@ -75,32 +77,47 @@ public void accept(String file) {
 
   private final TableOperations ops;
   private final Set<Long> idsToRemove = Sets.newHashSet();
+  private final long removalTimeReference;
   private boolean cleanExpiredFiles = true;
   private TableMetadata base;
-  private long expireOlderThan;
-  private int minNumSnapshots;
+  private Long expireOlderThan;
+  private Integer minNumSnapshots;
   private Consumer<String> deleteFunc = defaultDelete;
   private ExecutorService deleteExecutorService = 
DEFAULT_DELETE_EXECUTOR_SERVICE;
   private ExecutorService planExecutorService = ThreadPools.getWorkerPool();
 
   RemoveSnapshots(TableOperations ops) {
     this.ops = ops;
     this.base = ops.current();
-
-    long maxSnapshotAgeMs = PropertyUtil.propertyAsLong(
-        base.properties(),
-        MAX_SNAPSHOT_AGE_MS,
-        MAX_SNAPSHOT_AGE_MS_DEFAULT);
-    this.expireOlderThan = System.currentTimeMillis() - maxSnapshotAgeMs;
-
-    this.minNumSnapshots = PropertyUtil.propertyAsInt(
-        base.properties(),
-        MIN_SNAPSHOTS_TO_KEEP,
-        MIN_SNAPSHOTS_TO_KEEP_DEFAULT);
-
     ValidationException.check(
         PropertyUtil.propertyAsBoolean(base.properties(), GC_ENABLED, 
GC_ENABLED_DEFAULT),
         "Cannot expire snapshots: GC is disabled (deleting files may corrupt 
other tables)");
+
+    SnapshotRef mainBranch = base.ref(SnapshotRef.MAIN_BRANCH);
+    Long maxSnapshotAgeMs = null;
+
+    if (mainBranch != null) {
+      maxSnapshotAgeMs = mainBranch.maxSnapshotAgeMs();
+    }
+    if (maxSnapshotAgeMs == null) {
+      maxSnapshotAgeMs = PropertyUtil.propertyAsLong(
+              base.properties(),
+              MAX_SNAPSHOT_AGE_MS,
+              MAX_SNAPSHOT_AGE_MS_DEFAULT);
+    }

Review Comment:
   This preserves the behavior that if there is no max snapshot age defined on 
the main branch, the procedure will use table level property, and if that's 
still not defined, the procedure will use the default property. This was simply 
done to accommodate whatever existing tests we had which validate that the 
default is used.
   
    However, I'm not sure if this is the behavior we actually want. If there's 
null set on the main branch for maxSnapshotAgeMs, or for minNUmSnapshotsToKeep, 
it would make more sense for the procedure to treat that as nothing on main 
should be expired. Then at this point, the table properties would not matter 
for the expiration logic. Everything would have to be done via branch retention 
properties. 
   
   Then to preserve existing procedure behavior, the main branch would have to 
have the default retention properties?
    
    @rdblue thoughts?
    
    



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to