amogh-jahagirdar commented on code in PR #4578:
URL: https://github.com/apache/iceberg/pull/4578#discussion_r864396842
##########
core/src/main/java/org/apache/iceberg/RemoveSnapshots.java:
##########
@@ -161,21 +173,94 @@ public List<Snapshot> apply() {
private TableMetadata internalApply() {
this.base = ops.refresh();
+ if (cleanExpiredFiles && base.refs().size() > 1) {
+ throw new UnsupportedOperationException(
+ "Cannot perform snapshot expiration and have clean files set when
there is more than 1 ref");
+ }
Review Comment:
My doubt here is that say we have the following commit graph
A - B - C
` - D
A - B - C is main, A - D is test
If table.expireSnapshots().cleanExpiredFiles(false).commit() and A expires.
Let's also say the test branch reference also expires. Then we have the
following disconnected graph.
B - C <main>
D
If a user now does
table.expireSnapshots().cleanExpiredFiles(**true**).commit(), and say B should
be expired. There's no reference to D but because we are decoupling ref aging
from the snapshot aging it could still be retained. we would allow the removal
of files that are in B but not in C. However those files could still be in D,
so it's an unsafe deletion.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]