amogh-jahagirdar commented on pull request #4071:
URL: https://github.com/apache/iceberg/pull/4071#issuecomment-1054890263
Sorry for the late update on this! @rdblue @jackye1995 when you get a chance
let me know your thoughts. I still see some points where we want to do more
validation but wanted to get your takes on the API, and operation structure.
After we get clarity there, then I'll write unit tests. Some points:
1.) There is no replaceBranch/replaceTag in this API. From looking at
previous threads and offline discussions, the overall desire for "replace" for
just metadata operations is to update retention properties. So in this,
replace* methods are named to setBranch/setTag retention respectively since I
think that name makes more sense, but let me know your thoughts.
If we want replace to mean take an existing branch and update it to a new
snapshot or create a new branch if one does not exist, a user could do
drop/create but it's more clunky that way. As @rdblue mentioned, it seems like
the main use case is to allow for someone to write data to a branch, and if the
branch doesn't exist we create it because performing the write is expensive.
Unless I'm mistaken, this would more related to SnapshotProducer changes rather
than Metadata changes?
Let me know your thoughts.
2.) The package private operation is just a single operation
(ManageSnapshotReferenceOperation) which encapsulates new additions to refs,
ref removals. Originally, i had multiple operations but it seemed kind of
redundant/overkill to have different internal operations compared to having a
single operation just maintain state. Let me know your preference.
3.) For rename, updating retention, these map to multiple MetadataUpdate. We
don't have a single "RenameBranchUpdate", this will map to
RemoveSnapshotRefUpdate + CreateSnapshotRefUpdate (with the new name). In our
operation implementation we ensure the ordering of these (as it seems like with
REST Catalog + this Metadata Update approach, the client will expect an ordered
list of these updates). My only doubt here is it seems inefficient to have
multiple updates for a rename/update retention.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]