Anton-Tarazi commented on issue #2604: URL: https://github.com/apache/iceberg-python/issues/2604#issuecomment-3395335348
I don't know how the Java implementation does it, but if we delete the data files _after_ the new metadata is committed, that wouldn't cause a long-running transaction. Other processes would be free to write to the table while the `expire_snapshots` process continues deleting the relevant data files. Doing it in this order is fine, since by that point those files are orphaned from the table. (Once #1958 is merged, one could just call `remove_orphan_files` after `expire_snapshots` and the result would be the same, but I think it's valuable to have `expire_snapshots` be consistent with the Java version.)

Making this opt-in seems reasonable. I think if we're going to deviate from the spec, it's better to make it an argument rather than a table property.
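
To illustrate the ordering proposed above, here is a minimal sketch using a toy in-memory table. It is not the pyiceberg API: `ToyTable`, the `delete_file` callback, and the `delete_files` argument are all hypothetical names invented for this example, and a real table tracks data files through manifests rather than a plain dict. The point is only that the metadata commit happens first and the opt-in file cleanup happens afterwards, outside the commit.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class ToyTable:
    """In-memory stand-in for a table (hypothetical, NOT pyiceberg's Table)."""

    # snapshot id -> data files referenced by that snapshot
    snapshots: Dict[int, Set[str]] = field(default_factory=dict)

    def commit(self, snapshots: Dict[int, Set[str]]) -> None:
        # Stands in for the short, atomic metadata commit.
        self.snapshots = snapshots


def expire_snapshots(
    table: ToyTable,
    expired_ids: Set[int],
    delete_file: Callable[[str], None],
    delete_files: bool = False,  # hypothetical opt-in argument
) -> Set[str]:
    """Expire snapshots, then optionally delete files nothing references."""
    remaining = {
        sid: files
        for sid, files in table.snapshots.items()
        if sid not in expired_ids
    }
    referenced_before = set().union(*table.snapshots.values())
    referenced_after = set().union(*remaining.values())
    orphaned = referenced_before - referenced_after

    # Step 1: commit the new metadata. Other writers are unblocked after this.
    table.commit(remaining)

    # Step 2: only after the commit, delete the now-orphaned files.
    if delete_files:
        for path in sorted(orphaned):
            delete_file(path)
    return orphaned


table = ToyTable({1: {"a.parquet", "b.parquet"}, 2: {"b.parquet", "c.parquet"}})
deleted = []
orphaned = expire_snapshots(table, {1}, deleted.append, delete_files=True)
```

Here `b.parquet` survives because snapshot 2 still references it, so only `a.parquet` is handed to `delete_file`. With the default `delete_files=False`, the commit still happens but nothing is deleted, which matches the opt-in behaviour discussed above.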
