Hi Prashant,

Sorry for the delayed reply, and apologies if I missed some relevant discussion.
As I understand it, the catalog could remove snapshots that fall between the previous and current snapshots from the perspective of one of the clients. Can we be sure that the removed snapshots do not contain material data changes (e.g. new rows or updated rows) that should have been taken into account by the client whose snapshot is forced to become "current"? Could this result in data loss?

Thanks,
Dmitri.

On 2025/03/31 22:44:03 Prashant Singh wrote:
> Hey folks,
>
> I wanted to propose this feature to Apache Polaris Rolling back
> replacements operation snapshots in the case during the concurrent write
> (compaction and other writers trying to commit to the table at the same
> time) to Iceberg there are conflicts. This is a feature which Ryan proposed
> as an alternative when I was proposing a Priority Amongst Writer proposal
> [1] in the Apache Iceberg community. This kind of makes the compaction
> always a low priority process.
>
> Earlier, I went ahead and added this feature as a client side change in the
> Apache Iceberg repo [2] . It got some attraction but this didn't get to the
> end. Now when we think more about it again Apache Polaris seems to be the
> best place to do it as it can benefit other language writer clients as well
> and Polaris is the one to actually apply the commits based on the
> requirements and update sent by Iceberg Rest Client.
>
> Here is my draft PR [3] on how I think this can be achieved, given this is
> enabled by a table property, happy to discuss other knobs for ex: maybe
> check the snapshot prop ?
>
> The logic essentially if we see is the base (B) on which the snapshot we
> want to include/commit is based on is changed to something like (B`) and
> the given snapshot from B` to B are all of ops type *REPLACE *. It adds
> other updates within the same update Table req
> 1. moved the snapshot ref to B
> 2. [Optional] to remove the snapshot between B` to B given its all of
> *REPLACE*.
> Then try the requirements and updates again on the updated base and see if
> it succeeds. To make all this as part of one updateReq and then commit to
> the table.
> Doing it this way preserves the schema changes for which no new snapshot
> has been created, just a new metadata.json is created.
>
> Happy to know your thoughts on the same.
>
> Links:
> [1]
> https://docs.google.com/document/d/1pSqxf5A59J062j9VFF5rcCpbW9vdTbBKTmjps80D-B0/edit?tab=t.0#heading=h.fn6jmpw6phpn
> [2] https://github.com/apache/iceberg/pull/5888
> [3] https://github.com/apache/polaris/pull/1285
>
> Best,
> Prashant Singh
>
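
For reference, the check described in the quoted proposal could be sketched roughly as follows. This is only an illustration in Python under my own assumptions, not the code in the draft PR: the Snapshot class, the replace_only_snapshots_since function and the dictionary-based history are hypothetical names. The operation values follow the Iceberg snapshot summary (append, replace, overwrite, delete). The function walks back from the current snapshot (B`) to the writer's base (B) and allows the rollback only if every snapshot in between is a replace. A single append, overwrite or delete in that range makes it return None, which is the case the data-loss question above is about.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical, simplified stand-in for an Iceberg snapshot.
    snapshot_id: int
    parent_id: Optional[int]
    operation: str  # "append", "replace", "overwrite", or "delete"


def replace_only_snapshots_since(
    snapshots: Dict[int, Snapshot], base_id: int, current_id: int
) -> Optional[List[int]]:
    """Return the ids of the snapshots after base_id up to current_id
    (newest first) if all of them are "replace"; otherwise return None."""
    to_remove: List[int] = []
    cursor: Optional[int] = current_id
    while cursor is not None and cursor != base_id:
        snap = snapshots.get(cursor)
        if snap is None or snap.operation != "replace":
            return None  # a data-changing or unknown snapshot blocks the rollback
        to_remove.append(cursor)
        cursor = snap.parent_id
    if cursor is None:
        return None  # base_id is not an ancestor of current_id
    return to_remove


history = {
    1: Snapshot(1, None, "append"),
    2: Snapshot(2, 1, "replace"),
    3: Snapshot(3, 2, "replace"),
    4: Snapshot(4, 3, "append"),
}
print(replace_only_snapshots_since(history, base_id=1, current_id=3))  # [3, 2]
print(replace_only_snapshots_since(history, base_id=1, current_id=4))  # None
```

In the first call the ref could be moved back to snapshot 1 and snapshots 3 and 2 optionally removed before retrying the writer's requirements and updates. In the second call the append in snapshot 4 prevents the rollback, so the commit would fail as a normal conflict.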