JingsongLi opened a new pull request, #7934: URL: https://github.com/apache/paimon/pull/7934
Introduce a new .row file format optimized for fast point lookups by row number, designed for deletion vector applications and changelog materialization. The format stores data in ZSTD-compressed blocks with a block index enabling binary search by row number.

Key components:

- RowFormatWriter/Reader: block-level write and read with projection and selection (RoaringBitmap) pushdown
- BlockPrefetcher: concurrent IO with range coalescing (merges adjacent blocks within 256KB gap, up to 2MB per range) and prefetch sliding window
- InputStreamPool: lazy stream pool that opens streams on demand for concurrent reads
- RowBlockWriter/Reader: compact row serialization supporting all Paimon types including nested ARRAY, MAP, ROW, and VARIANT
- RowBlockIndex: delta+zigzag+varint encoded block metadata
- Documentation: rowformat.md specification and fileformat.md updates
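The read path described above (binary search in the block index, then range coalescing before IO) can be illustrated with a minimal Python sketch. This is not the code from the pull request: the `Block` record, `find_block`, and `coalesce` are hypothetical names. It assumes that "within 256KB gap" means a gap of at most 256 KiB between the end of one block and the start of the next, and that 2MB is a hard cap on the size of a merged range.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Thresholds taken from the PR description; exact semantics are assumed.
MAX_GAP = 256 * 1024          # merge blocks separated by at most this many bytes
MAX_RANGE = 2 * 1024 * 1024   # assumed hard cap on one coalesced read


@dataclass(frozen=True)
class Block:
    """Hypothetical block index entry (not the PR's RowBlockIndex layout)."""
    first_row: int  # row number of the first row stored in the block
    offset: int     # byte offset of the compressed block in the file
    length: int     # compressed length in bytes


def find_block(blocks, row):
    """Binary search for the index of the block whose rows include `row`.

    `blocks` must be sorted by first_row; the caller is assumed to pass a
    row number smaller than the total row count of the file.
    """
    first_rows = [b.first_row for b in blocks]
    i = bisect_right(first_rows, row) - 1
    if i < 0:
        raise KeyError(row)
    return i


def coalesce(blocks, max_gap=MAX_GAP, max_range=MAX_RANGE):
    """Merge blocks into (offset, length) read ranges, sorted by offset."""
    ranges = []  # list of (start, end) byte positions, end exclusive
    for b in sorted(blocks, key=lambda blk: blk.offset):
        new_end = b.offset + b.length
        if ranges:
            start, end = ranges[-1]
            # Merge only if the gap is small and the merged range stays under the cap.
            if b.offset - end <= max_gap and new_end - start <= max_range:
                ranges[-1] = (start, max(end, new_end))
                continue
        ranges.append((b.offset, new_end))
    return [(start, end - start) for start, end in ranges]
```

A reader following this sketch would map each requested row number to a block with `find_block`, deduplicate the resulting blocks, and pass them to `coalesce` so that nearby blocks are fetched with one request instead of several.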
