Baunsgaard opened a new pull request, #16440: URL: https://github.com/apache/iceberg/pull/16440
Add `VectorizedPositionDeleteReader`, an Arrow-vectorized reader for V2 position delete files. It reads `(file_path, pos)` directly from Arrow `VarChar` / `BigInt` buffers and feeds the shared `RangeAccumulator` from #16052, so runs of consecutive positions become `PositionDeleteIndex.delete(start, end)` range inserts. There is no per-row Java allocation on the hot path.

Stacked on #16052: that PR adds the coalescing primitive in `iceberg-core`, and this PR wires it into the Arrow read path.

`BaseDeleteLoader.readPosDeletes` now dispatches through a new `PositionDeleteIndexReader` SPI on `FormatModelRegistry`, so the Arrow path is picked up automatically when `iceberg-arrow` is on the classpath.

Like #16052, this primarily benefits Iceberg V2 tables; V3 DVs deserialize directly and bypass both paths.

Benchmark numbers will be posted as a follow-up comment.
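To illustrate the coalescing idea described above, here is a minimal, self-contained sketch of how consecutive positions could be folded into range inserts. It is not the `RangeAccumulator` from #16052 and does not use the Arrow or Iceberg APIs: the class `RangeCoalescerSketch`, the `RangeSink` interface, and the `coalesce` helper are hypothetical names introduced only for this example. It assumes positions are non-negative and that the range end is exclusive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of range coalescing; NOT the RangeAccumulator from #16052.
class RangeCoalescerSketch {

  /** Hypothetical stand-in for a range-capable delete index; end is assumed exclusive. */
  interface RangeSink {
    void delete(long start, long end);
  }

  private final RangeSink sink;
  private long start = -1; // inclusive start of the open run, -1 when no run is open
  private long end = -1;   // exclusive end of the open run

  RangeCoalescerSketch(RangeSink sink) {
    this.sink = sink;
  }

  /** Adds one position (assumed non-negative); extends the open run if pos is adjacent to it. */
  void add(long pos) {
    if (start >= 0 && pos == end) {
      end++; // pos continues the current run, so no range is emitted yet
      return;
    }
    flush(); // pos breaks the run: emit what we have, then open a new run
    start = pos;
    end = pos + 1;
  }

  /** Emits the open run, if any. Must be called once after the last position. */
  void flush() {
    if (start >= 0) {
      sink.delete(start, end);
      start = -1;
      end = -1;
    }
  }

  /** Convenience helper for the example: collects emitted ranges as {start, end} pairs. */
  static List<long[]> coalesce(long[] positions) {
    List<long[]> ranges = new ArrayList<>();
    RangeCoalescerSketch acc =
        new RangeCoalescerSketch((s, e) -> ranges.add(new long[] {s, e}));
    for (long pos : positions) {
      acc.add(pos);
    }
    acc.flush();
    return ranges;
  }

  public static void main(String[] args) {
    // Six positions collapse into three range inserts.
    for (long[] r : coalesce(new long[] {3, 4, 5, 9, 10, 42})) {
      System.out.println("[" + r[0] + ", " + r[1] + ")");
    }
    // prints:
    // [3, 6)
    // [9, 11)
    // [42, 43)
  }
}
```

The sketch only merges a position into the open run when it is exactly adjacent, so sorted input yields the fewest range inserts; unsorted or duplicate input still produces a correct set of ranges, just more of them.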
