amogh-jahagirdar commented on pull request #4071:
URL: https://github.com/apache/iceberg/pull/4071#issuecomment-1054890263


   Sorry for the late update on this! @rdblue @jackye1995 when you get a chance 
let me know your thoughts. I still see some points where we want to do more 
validation but wanted to get your takes on the API, and operation structure. 
After we get clarity there, then I'll write unit tests. Some points: 
   
   1.) There is no replaceBranch/replaceTag in this API. From looking at 
previous threads and offline discussions, the overall desire for "replace" for 
just metadata operations is to update retention properties. So in this, 
replace* methods are named to setBranch/setTag retention respectively since I 
think that name makes more sense, but let me know your thoughts.
    If we want replace to mean take an existing branch and update it to a new 
snapshot or create a new branch if one does not exist, a user could do 
drop/create but it's more clunky that way. As @rdblue mentioned, it seems like 
the main use case is to allow for someone to write data to a branch, and if the 
branch doesn't exist we create it because performing the write is expensive. 
Unless I'm mistaken, this would more related to SnapshotProducer changes rather 
than Metadata changes?
   Let me know your thoughts.
   
   2.) The package private operation is just a single operation 
(ManageSnapshotReferenceOperation) which encapsulates new additions to refs, 
ref removals. Originally, i had multiple operations but it seemed kind of 
redundant/overkill to have different internal operations compared to having a 
single operation just maintain state. Let me know your preference.
   
   3.) For rename, updating retention, these map to multiple MetadataUpdate. We 
don't have a single "RenameBranchUpdate", this will map to 
RemoveSnapshotRefUpdate + CreateSnapshotRefUpdate (with the new name). In our 
operation implementation we ensure the ordering of these (as it seems like with 
REST Catalog + this Metadata Update approach, the client will expect an ordered 
list of these updates). My only doubt here is it seems inefficient to have 
multiple updates for a rename/update retention.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to