jhump commented on issue #6710: URL: https://github.com/apache/iceberg/issues/6710#issuecomment-3482069162
It is unfortunate that this issue was auto-closed just because the right person didn't happen to see it. This is a legitimate issue, and there's another one: there's no safe way to delete a snapshot. While we can verify that none of the existing references have been concurrently modified when deleting a snapshot, we cannot verify that no *new* references were added. So the following sequence is possible:

1. Updater 1 loads the table and computes old snapshots to remove based on the set of snapshots and all refs.
2. Concurrently, updater 2 adds a new reference to one of those older snapshots.
3. Updater 1 now issues the call to remove snapshots and incorrectly drops the snapshot for the ref that was just added.

> for updateTable, there is UUID check for every commit, that should guarantee the uniqueness of the metadata. Isn't metadata location decided on the server side for each new commit (probably includes UUID in the path)?

@agnes-xinyi-lu, the UUID is a unique ID assigned to the _table_, not to each commit. It is the same for the lifetime of the table and is not sufficient to determine whether any of the table's metadata has changed (other than the UUID itself; many catalogs forbid changing the UUID and allow that update action only at table creation).

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
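The three-step sequence in the comment above can be reproduced with a toy in-memory model. This is only a sketch under stated assumptions: `ToyTable`, `commit_remove`, `dangling_refs`, and the ref name `audit` are hypothetical names invented for illustration and are not part of any Iceberg API. The commit check here only compares the refs the updater originally saw, which is the limitation the comment describes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for table metadata (hypothetical, not an Iceberg API)."""
    snapshots: set = field(default_factory=set)
    refs: dict = field(default_factory=dict)  # ref name -> snapshot id

    def commit_remove(self, snapshot_ids, expected_refs):
        # The only check modeled here: every ref the updater saw still points
        # at the same snapshot. Refs the updater never saw are not checked.
        for name, snap in expected_refs.items():
            if self.refs.get(name) != snap:
                raise RuntimeError(f"ref {name!r} changed concurrently")
        self.snapshots -= set(snapshot_ids)

    def dangling_refs(self):
        # Refs that point at a snapshot which no longer exists.
        return {n: s for n, s in self.refs.items() if s not in self.snapshots}


def simulate_race():
    table = ToyTable(snapshots={1, 2, 3}, refs={"main": 3})

    # 1. Updater 1 loads the table and picks unreferenced snapshots to remove.
    seen_refs = dict(table.refs)
    to_remove = table.snapshots - set(seen_refs.values())  # snapshots 1 and 2

    # 2. Updater 2 concurrently adds a new ref to one of those snapshots.
    table.refs["audit"] = 1

    # 3. Updater 1 commits; its checks pass because "main" is unchanged.
    table.commit_remove(to_remove, seen_refs)
    return table.dangling_refs()
```

Calling `simulate_race()` returns `{'audit': 1}`: the commit succeeds even though the newly added ref now points at a removed snapshot. Closing the gap would need a check over the full set of refs (or a version of the whole metadata), not only the refs known to the updater.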
