rdblue commented on pull request #4071: URL: https://github.com/apache/iceberg/pull/4071#issuecomment-1035345999
This seems much like the `ManageSnapshots` API to me. I'm wondering how we should proceed. Background on `ManageSnapshots`: this is the API that we use to cherry-pick snapshots and do rollbacks. These high-level operations are really similar to what's happening here. A rollback is setting the snapshot for the main branch, a cherry-pick moves a snapshot between branches (if possible), and you can also set the current snapshot. One thing that I've wanted to do with this API for a while is to make it possible to perform multiple changes as a transaction. That is, pick more than one snapshot and then commit. I think these changes might fit well into `ManageSnapshots`. This could enable doing things like `createBranch` and then cherry-pick into that branch. Plus, we'll need to update `ManageSnapshots` for branching and tagging anyway. Things like cherry-pick to a branch other than `main` and cherry-picking a snapshot by tag rather than by ID. Also, one of the concerns I have with the current PR is the set of operations that we're exposing. For example, doesn't make sense to me to just have `branch` without some associated verb, like `createBranch`, `replaceBranch`, or `removeBranch`. So what I propose is to add these changes to `ManageSnapshots` and extend the API like this: * Add `createBranch`, `replaceBranch`, `removeBranch`, and `renameBranch` * Add `createTag`, `replaceTag` and `removeTag` (not `renameTag` because it doesn't make much sense, right?) * Add `rollbackTo(String branchName, long snapshotId)` * Add `cherrypick(String branchName, long snapshotId)` Then there are a couple of other methods that we might want to add: * Add `cherrypick(String branchName, String tagName)` * Add `cherrypickAll(String branchName, String branchName)` The `cherrypickAll` method would require us turning this into a transaction and not just a `SnapshotUpdate`, though. That could require us to do some of these things separately and maybe even deprecate `ManageSnapshots` and come up with a different name for the combined `ManageSnapshots` and `UpdateSnapshotRefs` interface. (Sorry this idea isn't fully baked yet!) What do you think, @amogh-jahagirdar and @jackye1995? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
