rdblue commented on pull request #4071: URL: https://github.com/apache/iceberg/pull/4071#issuecomment-1046292111
> Going back to the cherry-picking related operations, the fundamental difference of it from the other ones is that it produces a new snapshot with a new manifest list based on the cherry-picked snapshot information. What exactly is our semantics for cherry-picking?

This is a great question. Right now, cherry picking is not like git. Git cherry picking re-applies a diff, but makes no guarantee about the semantic changes. Iceberg cherry picking (so far) preserves the semantics of the original operation. That's why we currently only support appends and overwrites that are replace partitions. For those, we can check that the changes can still be safely applied by re-validating the commit checks. For append, there are no checks, so it is safe. For replace partitions, we validate that no new files have been added to the replaced partitions.

Iceberg doesn't currently support cherry-picking snapshots that require knowing more about the original operation. For example, DeleteFiles using a filter would need to store the filter and validate that no other data files were added that match the delete filter. Or maybe it would just run the delete filter again. My plan is to add these when people start asking for them.

For now, I think cherrypickAll would just run `cherrypick` in a loop. The main benefit would be keeping track of where the branch diverged. But this is probably more advanced than we need for now. In the short term, I'd focus on getting the branching and tagging parts in. Cherry picking a branch is something we can add later.
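
To make the rules above concrete, here is a minimal toy model in Java. It is only a sketch of the reasoning in this comment, not Iceberg code: `CherryPickSketch`, `Op`, `Snapshot`, `canCherryPick`, and the `partitionsAddedSince` parameter are all hypothetical names invented for illustration, and partitions are modeled as plain strings. It assumes the only state needed for re-validation is the set of partitions that received new files after the snapshot's parent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are Iceberg classes.
class CherryPickSketch {
    enum Op { APPEND, REPLACE_PARTITIONS, DELETE_BY_FILTER }

    // A staged snapshot: its operation and the partitions it touches.
    static final class Snapshot {
        final long id;
        final Op op;
        final Set<String> partitions;

        Snapshot(long id, Op op, String... partitions) {
            this.id = id;
            this.op = op;
            this.partitions = new HashSet<>(Arrays.asList(partitions));
        }
    }

    // Re-run the original operation's commit checks against the current state.
    // partitionsAddedSince (assumed input): partitions that received new files
    // after the snapshot's parent was committed.
    static boolean canCherryPick(Snapshot s, Set<String> partitionsAddedSince) {
        switch (s.op) {
            case APPEND:
                return true; // appends have no commit checks
            case REPLACE_PARTITIONS:
                // safe only if no new files landed in the replaced partitions
                return Collections.disjoint(s.partitions, partitionsAddedSince);
            default:
                return false; // would need the original filter, not modeled here
        }
    }

    // "cherrypick in a loop": validate each snapshot in order, fail on the first
    // one that cannot be re-validated. The state is not updated between picks.
    static List<Long> cherrypickAll(List<Snapshot> snapshots, Set<String> partitionsAddedSince) {
        List<Long> applied = new ArrayList<>();
        for (Snapshot s : snapshots) {
            if (!canCherryPick(s, partitionsAddedSince)) {
                throw new IllegalStateException("Cannot cherry-pick snapshot " + s.id);
            }
            applied.add(s.id);
        }
        return applied;
    }
}
```

In this model an append is always accepted, a replace-partitions snapshot is rejected as soon as one of its partitions appears in `partitionsAddedSince`, and a filter-based delete is always rejected because the sketch has no stored filter to re-check.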
