gtrettenero opened a new issue, #14402: URL: https://github.com/apache/iceberg/issues/14402
### Feature Request / Improvement

I was thinking about the default lifetimes when creating branches. By default, a branch is retained forever, because `history.expire.max-ref-age-ms` defaults to `Long.MAX_VALUE`. Depending on the user's workflow, a branch can diverge from 'main' when the 'main' snapshot from which the branch was originally created has been expired. In these cases, the branch can never be merged into 'main', since we'd be missing the ancestry.

Currently, one way around this, if we want to use the diverged branch's snapshots, is the `set_current_snapshot` procedure. The procedure I'm proposing would instead iterate through the snapshots belonging to non-'main' branches and drop any branch whose snapshots have no common ancestor with 'main'. This would be a simpler cleanup procedure for tables where several engineers create branches, and it would prevent those branches from accumulating. It would be used in cases where users don't want a time-based cleanup for their branches with `max-ref-age-ms`.

I wanted to get some thoughts from OSS folks on whether adding this procedure would be useful.

### Query engine

Spark

### Willingness to contribute

- [x] I can contribute this improvement/feature independently
- [ ] I would be willing to contribute this improvement/feature with guidance from the Iceberg community
- [ ] I cannot contribute this improvement/feature at this time

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
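To make the proposed check concrete, here is a minimal sketch of the "no common ancestor with 'main'" test on a toy model of table metadata. It does not use the Iceberg API: the snapshot graph is a plain dictionary from snapshot ID to parent ID, expired snapshots are simply absent from it, and the names `ancestor_ids` and `find_diverged_branches` are hypothetical. A real procedure would presumably read snapshots and refs from the table's metadata and drop the reported branches instead of only listing them.

```python
from typing import Dict, List, Optional, Set


def ancestor_ids(parents: Dict[int, Optional[int]], head: int) -> Set[int]:
    """Return `head` plus every ancestor that is still present in `parents`."""
    seen: Set[int] = set()
    current: Optional[int] = head
    # Stop at the root (parent is None), at an expired snapshot (ID missing
    # from `parents`), or if a cycle is detected.
    while current is not None and current in parents and current not in seen:
        seen.add(current)
        current = parents[current]
    return seen


def find_diverged_branches(
    parents: Dict[int, Optional[int]],
    branches: Dict[str, int],
    main: str = "main",
) -> List[str]:
    """List branches whose remaining lineage shares no snapshot with `main`."""
    main_lineage = ancestor_ids(parents, branches[main])
    return sorted(
        name
        for name, head in branches.items()
        if name != main and ancestor_ids(parents, head).isdisjoint(main_lineage)
    )


# Toy metadata (assumed for illustration): snapshots 1 and 2 have been
# expired, so they no longer appear as keys even though children still
# reference them as parents.
parents = {3: 2, 4: 3, 5: 4, 10: 1, 11: 10, 20: 3}
branches = {"main": 5, "feature": 20, "stale": 11}

print(find_diverged_branches(parents, branches))  # ['stale']
```

In this toy table, `feature` still reaches snapshot 3, which is on the lineage of 'main', so it is kept. `stale` forked from snapshot 1, which has been expired, so its remaining lineage is disjoint from that of 'main' and it would be a candidate for dropping.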
