mzzz-zzm opened a new pull request, #982:
URL: https://github.com/apache/iceberg-go/pull/982

   fix(occ): rebuild manifest list on OCC retry to inherit concurrent writes
   
   Fixes #976
   
   ## Problem
   
   When `doCommit` retries after an OCC conflict, it reuses the stale snapshot 
that was built before the first attempt. That snapshot's manifest list was 
written against the original parent, so it does not include any data files 
committed by concurrent writers in the meantime.
   
   On catalogs that perform server-side snapshot validation (e.g. AWS S3 
Tables), this causes **silent data loss**: the retried commit succeeds (HTTP 
200) but the stale manifest list effectively drops the concurrent writer's 
files from the table history.
   
   ## Fix
   
   Each `snapshotProducer` now records which manifests it wrote itself 
(`ownManifests`, i.e. those not inherited from the original parent). A 
`rebuildManifestList` closure is attached to the `addSnapshotUpdate` and is 
called by `doCommit` on every retry. The closure:
   
   1. Loads the fresh branch head's manifest list (concurrent manifests).
   2. Concatenates `ownManifests` with the fresh inherited manifests.
   3. Writes a new manifest list file with a retry-attempt suffix so each 
attempt gets a unique path.
   4. Adjusts `sequence-number` relative to the fresh parent.
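   
   The steps above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: `manifestFile`, `snapshot`, `rebuildSnapshot`, the path-based comparison in `computeOwnManifests`, and the `-attempt-N.avro` suffix format are all hypothetical stand-ins, and the sketch only computes the new path instead of writing a file. It also assumes the new sequence number is simply the fresh parent's plus one.

```go
package main

import "fmt"

// manifestFile is a hypothetical, simplified manifest list entry.
type manifestFile struct {
	Path string
}

// snapshot is a hypothetical, simplified snapshot; the real type has more fields.
type snapshot struct {
	SequenceNumber int64
	ManifestList   string
	Manifests      []manifestFile
}

// computeOwnManifests returns the entries of written that do not appear in the
// original parent's manifest list. Comparing by path is an assumption here.
func computeOwnManifests(written, parent []manifestFile) []manifestFile {
	inherited := make(map[string]struct{}, len(parent))
	for _, m := range parent {
		inherited[m.Path] = struct{}{}
	}
	var own []manifestFile
	for _, m := range written {
		if _, ok := inherited[m.Path]; !ok {
			own = append(own, m)
		}
	}
	return own
}

// rebuildSnapshot recomputes the snapshot for a retry against freshHead.
// freshHead may be nil when the branch has no snapshot yet.
func rebuildSnapshot(own []manifestFile, freshHead *snapshot, listPrefix string, attempt int) snapshot {
	var inherited []manifestFile
	var parentSeq int64
	if freshHead != nil {
		// Step 1: take the manifests of the fresh branch head.
		inherited = freshHead.Manifests
		parentSeq = freshHead.SequenceNumber
	}

	// Step 2: own manifests first, then the freshly inherited ones.
	merged := make([]manifestFile, 0, len(own)+len(inherited))
	merged = append(merged, own...)
	merged = append(merged, inherited...)

	return snapshot{
		// Step 4: assumed here to be the fresh parent's sequence number + 1.
		SequenceNumber: parentSeq + 1,
		// Step 3: a per-attempt suffix keeps each manifest list path unique.
		ManifestList: fmt.Sprintf("%s-attempt-%d.avro", listPrefix, attempt),
		Manifests:    merged,
	}
}
```

   Because `own` is computed once against the original parent and only the inherited part is refreshed, a retry picks up concurrent writers' manifests without re-adding manifests the producer had merely inherited.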
   
   After a successful commit, `doCommit` deletes the manifest list files 
produced by superseded retry attempts (orphan cleanup).
   
   ## Files changed
   
   - `table/snapshot_producers.go`: `computeOwnManifests`, `rebuildFn` closure
   - `table/updates.go`: `ownManifests` / `rebuildManifestList` fields on 
`addSnapshotUpdate`, propagated through `Apply`
   - `table/table.go`: `rebuildSnapshotUpdates` helper, orphan cleanup after 
commit
   - `table/rebuild_manifest_test.go`: unit tests for `rebuildSnapshotUpdates`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

