Baunsgaard opened a new pull request, #16440:
URL: https://github.com/apache/iceberg/pull/16440

   Add `VectorizedPositionDeleteReader`, an Arrow-vectorized reader for V2 
position delete files. Reads `(file_path, pos)` directly from Arrow `VarChar` / 
`BigInt` buffers and feeds the shared `RangeAccumulator` from #16052, so 
consecutive positions become `PositionDeleteIndex.delete(start, end)` range 
inserts. No per-row Java allocation on the hot path.
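
As a rough illustration of the coalescing idea only (not the `RangeAccumulator` from #16052, whose API is not shown here), the sketch below folds a sorted run of positions into range inserts. `RangeCoalescingSketch` and `RangeSink` are hypothetical names, a plain `long[]` stands in for the Arrow `BigInt` buffer, and the end-exclusive range convention is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of coalescing consecutive positions into ranges. */
final class RangeCoalescingSketch {

  /** Stand-in for a sink shaped like PositionDeleteIndex.delete(start, end). */
  interface RangeSink {
    void delete(long startInclusive, long endExclusive);
  }

  private final RangeSink sink;
  private long start;           // first position of the open range
  private long end;             // exclusive end of the open range
  private boolean open = false; // true while a range is being extended

  RangeCoalescingSketch(RangeSink sink) {
    this.sink = sink;
  }

  /** Extends the open range if pos is adjacent, otherwise emits it and starts a new one. */
  void add(long pos) {
    if (open && pos == end) {
      end = pos + 1;
      return;
    }
    flush();
    start = pos;
    end = pos + 1;
    open = true;
  }

  /** Emits the open range, if any. Call once after the last position. */
  void flush() {
    if (open) {
      sink.delete(start, end);
      open = false;
    }
  }

  /** Convenience driver: a primitive array plays the role of the position buffer. */
  static List<long[]> coalesce(long[] positions) {
    List<long[]> ranges = new ArrayList<>();
    RangeCoalescingSketch acc =
        new RangeCoalescingSketch((s, e) -> ranges.add(new long[] {s, e}));
    for (long pos : positions) {
      acc.add(pos);
    }
    acc.flush();
    return ranges;
  }
}
```

Under these assumptions, positions `0, 1, 2, 5, 7, 8` produce three sink calls, `[0, 3)`, `[5, 6)` and `[7, 9)`, instead of six single-position inserts. The only per-position work is a comparison on primitives; nothing is boxed.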
   
   Stacked on #16052: that PR adds the coalescing primitive in `iceberg-core`, 
and this PR wires it into the Arrow read path.
   
   `BaseDeleteLoader.readPosDeletes` now dispatches through a new 
`PositionDeleteIndexReader` SPI on `FormatModelRegistry`, so the Arrow path is 
picked up automatically when `iceberg-arrow` is on the classpath.
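
For readers unfamiliar with classpath-driven dispatch, here is a generic sketch of the pattern. It is not the `FormatModelRegistry` mechanism from this PR: `ReaderDispatchSketch`, `PosDeleteReader`, and the class name passed in are all hypothetical, and reflective probing is only one assumed way to get "used when present, fallback otherwise" behavior.

```java
import java.util.Optional;

/** Hypothetical sketch: prefer an optional implementation if it is on the classpath. */
final class ReaderDispatchSketch {

  /** Stand-in for a reader SPI; a single method keeps the example small. */
  interface PosDeleteReader {
    String name();
  }

  /** Tries to instantiate className via its no-arg constructor; empty if unavailable. */
  static Optional<PosDeleteReader> tryLoad(String className) {
    try {
      Class<?> cls = Class.forName(className);
      return Optional.of((PosDeleteReader) cls.getDeclaredConstructor().newInstance());
    } catch (ReflectiveOperationException | ClassCastException | LinkageError e) {
      // Class missing, not instantiable, or not a PosDeleteReader: caller falls back.
      return Optional.empty();
    }
  }

  /** Returns the optional reader when it loads, otherwise the supplied fallback. */
  static PosDeleteReader select(String optionalClassName, PosDeleteReader fallback) {
    return tryLoad(optionalClassName).orElse(fallback);
  }
}
```

With this shape, callers never reference the optional module directly, so the fallback path keeps working when that module is absent.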
   
   Like #16052, this primarily benefits Iceberg V2 tables; V3 deletion vectors 
(DVs) deserialize directly and bypass both paths.
   
   Benchmark numbers will be posted as a follow-up comment.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

