Adding Matt Topol to the discussion.

Thanks,
Krutika

On Fri, Feb 6, 2026 at 3:55 PM Krutika Dhananjay <[email protected]> wrote:

> Hi everyone,
>
> I'd like to bring up a race condition between snapshot expiration and
> concurrent client commits that add refs to snapshots.
>
> *The Problem*
> Consider a table with snap-1 (unreferenced) and snap-2 (main). Two
> operations happen concurrently:
>
> 1. A maintenance job identifies snap-1 as eligible for expiration.
> 2. A client commits SetSnapshotRef("feature-branch", snap-1).
>
> If the client commits first, the maintenance job loads the fresh metadata
> (which now contains a ref for feature-branch -> snap-1), builds its
> RemoveSnapshots([snap-1]) update, and commits. The CAS succeeds because the
> maintenance job has the latest metadata pointer. But the resulting metadata
> is corrupt: feature-branch points to a deleted snapshot.
>
> CAS-based optimistic concurrency protects against concurrent pointer
> updates, but it does not validate the semantic correctness of the operation
> against the current state. The maintenance job's commit is structurally
> valid (latest pointer) but semantically invalid (removes a referenced
> snapshot).
>
> *Proposed Fix in iceberg-go*
> We addressed <https://github.com/apache/iceberg-go/pull/716> this in
> iceberg-go by modifying MetadataBuilder.RemoveSnapshots() to skip removing
> any snapshot that is currently referenced by a branch or tag. This works
> correctly -- the validation happens against the same metadata state that
> the CAS will verify hasn't changed.
> However, this changes the existing behavior of RemoveSnapshots, which
> currently removes the snapshot unconditionally and cleans up any refs that
> pointed to it. This would be a deviation from the behaviour in iceberg-java
> and iceberg-rs.
>
> *Proposed Alternative: AssertSnapshotNotReferenced*
> A less invasive approach that preserves the existing RemoveSnapshots
> behavior would be to introduce a new assertion update --
> AssertSnapshotNotReferenced. This would be analogous to existing
> assertions, but for validating that a snapshot has no refs pointing to it.
>
> During commit, a catalog-side maintenance job would include this assertion
> alongside its RemoveSnapshots update:
>
> updates: [
>       AssertSnapshotNotReferenced(snap-1),
>       RemoveSnapshots([snap-1])
> ]
>
> The assertion would be validated against the freshly loaded metadata
> during commit. If a concurrent client had linked a ref to snap-1 in the
> meantime, the assertion fails, the CAS-based commit is never attempted, and
> the maintenance job can retry with updated knowledge of the table state.
>
> This approach:
> - Does not change existing RemoveSnapshots semantics
> - Can be implemented uniformly across Java, Rust, and Go
>
> Would love to hear thoughts on the preferred way to handle this.
>
> Thanks,
> Krutika
>

Reply via email to