tanmayrauth commented on issue #997:
URL: https://github.com/apache/iceberg-go/issues/997#issuecomment-4488900751
Thanks for the review! Agreed on both gaps:
1. In-snapshot DV merging: I'll scope PR (1)/(2) to
one-DV-per-data-file-per-Flush and file a follow-up for the merge-on-conflict
case (DVWriter.AddExisting or similar). The writer can assert uniqueness per
data file path within a single Flush for now, and we can lean on #1050's
validation on the read side until the merge logic lands.
2. v2 pos-delete → v3 DV upgrade: same story. I'll file a follow-up issue and
note it in the PR description so it's tracked.
On the flush-trigger knob: Java's DeletionVectorWriter doesn't have a
target-file-size-bytes equivalent; it writes one Puffin per commit with
caller-driven flush. So I'll keep the same model (caller calls Flush
explicitly) and we can add a size-based auto-flush later if real workloads need
it.
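To make the caller-driven model concrete, a minimal sketch under assumptions: `sink` is a hypothetical stand-in for "write one Puffin file", and none of these names are the real API. The only trigger is an explicit `Flush` call; there is no size threshold:

```go
package main

import (
	"fmt"
	"sort"
)

// stagedDV pairs a data file path with its deleted row positions.
type stagedDV struct {
	DataFilePath string
	Positions    []int64
}

// dvWriter buffers DVs until the caller flushes. Placeholder type.
type dvWriter struct {
	staged map[string][]int64
	// sink stands in for writing a single Puffin file; hypothetical hook.
	sink func(batch []stagedDV) error
}

func newDVWriter(sink func([]stagedDV) error) *dvWriter {
	return &dvWriter{staged: make(map[string][]int64), sink: sink}
}

// Add stages a DV; a repeated path is rejected (no merging in this sketch).
func (w *dvWriter) Add(path string, positions []int64) error {
	if _, ok := w.staged[path]; ok {
		return fmt.Errorf("DV already staged for %s", path)
	}
	w.staged[path] = append([]int64(nil), positions...)
	return nil
}

// Flush hands everything staged so far to sink as one batch and resets the
// buffer. It returns the number of DVs written; an empty buffer is a no-op.
func (w *dvWriter) Flush() (int, error) {
	if len(w.staged) == 0 {
		return 0, nil
	}
	batch := make([]stagedDV, 0, len(w.staged))
	for path, pos := range w.staged {
		batch = append(batch, stagedDV{DataFilePath: path, Positions: pos})
	}
	// Sort by path so the output order does not depend on map iteration.
	sort.Slice(batch, func(i, j int) bool {
		return batch[i].DataFilePath < batch[j].DataFilePath
	})
	if err := w.sink(batch); err != nil {
		return 0, err // staged state is kept so the caller can retry
	}
	w.staged = make(map[string][]int64)
	return len(batch), nil
}

func main() {
	files := 0
	w := newDVWriter(func(batch []stagedDV) error {
		files++ // one sink call corresponds to one output file
		return nil
	})
	_ = w.Add("data/b.parquet", []int64{4})
	_ = w.Add("data/a.parquet", []int64{0, 7})
	n, err := w.Flush()
	fmt.Println(n, err, files)
}
```

If we add size-based auto-flush later, it could sit inside `Add` as an extra check that calls `Flush`, leaving the explicit path unchanged.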
Starting on PR (1) now.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]