Re: [I] feat(table): preserve row lineage through copy-on-write and merge-on-read [iceberg-go]

via GitHub Sun, 17 May 2026 10:28:48 -0700


laskoviymishka commented on issue #999:
URL: https://github.com/apache/iceberg-go/issues/999#issuecomment-4471781003


   I’d fold PR (2) into PR (1). A standalone PR for one test plus a comment on 
an unchanged line feels a little too small, it creates reviewer context-switch 
without much benefit.
   
   You’ll need that test anyway to prove the CoW rewrite is correct end-to-end, 
so I’d include it there: add the regression test in PR (1), and leave a short 
comment near `snapshot_producers.go:920` explaining that the overcount is 
intentional.
   
   So I’d make it two PRs total:
   
   1. CoW rewrite + accounting regression test/comment
   2. MoR insert/reinsert API
   
   Overall, the approach looks right to me.
   
   On Q1: agreed, `WithPreserveRowLineage()` keeping the snapshot producer 
unaware seems like the right shape, and it reuses `synthesizeRowLineageColumns` 
cleanly.
   
   On Q2: good catch. Preserved IDs are always below `firstRowID`, so advancing 
past the full manifest count can’t collide; it just leaves a hole. Java does 
the same thing, so this is fine.
   
   On Q3: insert/reinsert matching `SparkPositionDeltaWrite` makes sense. 
Nulling `_last_updated_sequence_number` on reinsert is also correct: the new 
file’s `data_sequence_number` will be the new snapshot’s seq#, so inheritance 
lands in the right place.
   
   One small sanity check before starting PR (1): can you confirm that 
`synthesizeRowLineageColumns` reads the file’s sequence number on the synthesis 
path, rather than anything cached from the source snapshot? If yes, then 
nulling on reinsert should behave as intended.
   
   Let's 🚢  it.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Re: [I] feat(table): preserve row lineage through copy-on-write and merge-on-read [iceberg-go]

Reply via email to