rdblue opened a new pull request, #16747:
URL: https://github.com/apache/iceberg/pull/16747

   This is a simple implementation of the Mumbling bitmap that has been 
proposed for embedded bitmaps in v4 metadata.
   
   This implementation includes 3 main classes:
   * `BitPacking`: bit packing and unpacking implementations for specific 
widths (1-7) used for descriptor encoding
   * `PFOREncoding`: [patched 
frame-of-reference](https://github.com/apache/iceberg/blob/main/format/mumbling-spec.md#pfor-encoding)
 encoding and decoding for descriptor bytes
   * `MumblingBitmap`: a read-only implementation of the Mumbling bitmap
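
   To illustrate the first two building blocks, here is a minimal sketch of 
generic bit packing for widths 1-7 combined with a patched frame-of-reference 
round trip. It is not the code in this PR and does not follow the byte layout 
in the spec: the class name `PforSketch`, the `Encoded` holder, the method 
signatures, the LSB-first bit order, and the separate exception arrays are all 
assumptions made for illustration.

```java
import java.util.Arrays;

public class PforSketch {

  /** Packs the low `width` bits of each value, least-significant bit first. */
  static byte[] pack(int[] values, int width) {
    if (width < 1 || width > 7) {
      throw new IllegalArgumentException("width must be in 1-7");
    }
    byte[] out = new byte[(values.length * width + 7) / 8];
    int bitPos = 0;
    for (int value : values) {
      for (int i = 0; i < width; i++, bitPos++) {
        if (((value >>> i) & 1) != 0) {
          out[bitPos >>> 3] |= (byte) (1 << (bitPos & 7));
        }
      }
    }
    return out;
  }

  /** Reverses pack(): reads `count` values of `width` bits each. */
  static int[] unpack(byte[] packed, int width, int count) {
    int[] out = new int[count];
    int bitPos = 0;
    for (int n = 0; n < count; n++) {
      for (int i = 0; i < width; i++, bitPos++) {
        int bit = (packed[bitPos >>> 3] >>> (bitPos & 7)) & 1;
        out[n] |= bit << i;
      }
    }
    return out;
  }

  /** Hypothetical holder; the real descriptor layout is defined by the spec. */
  static final class Encoded {
    final int base;
    final int width;
    final int count;
    final byte[] packed;
    final int[] exceptionIndexes;
    final int[] exceptionHighBits;

    Encoded(int base, int width, int count, byte[] packed, int[] idx, int[] high) {
      this.base = base;
      this.width = width;
      this.count = count;
      this.packed = packed;
      this.exceptionIndexes = idx;
      this.exceptionHighBits = high;
    }
  }

  /** Stores (value - min) in `width` bits; overflowing high bits become exceptions. */
  static Encoded encode(int[] values, int width) {
    int base = Arrays.stream(values).min().orElse(0);
    int mask = (1 << width) - 1;
    int[] low = new int[values.length];
    int exceptions = 0;
    for (int i = 0; i < values.length; i++) {
      int delta = values[i] - base;
      low[i] = delta & mask;
      if ((delta >>> width) != 0) {
        exceptions++;
      }
    }
    int[] idx = new int[exceptions];
    int[] high = new int[exceptions];
    int e = 0;
    for (int i = 0; i < values.length; i++) {
      int delta = values[i] - base;
      if ((delta >>> width) != 0) {
        idx[e] = i;
        high[e] = delta >>> width;
        e++;
      }
    }
    return new Encoded(base, width, values.length, pack(low, width), idx, high);
  }

  /** Unpacks the low bits, patches the exceptions back in, then adds the base. */
  static int[] decode(Encoded enc) {
    int[] out = unpack(enc.packed, enc.width, enc.count);
    for (int e = 0; e < enc.exceptionIndexes.length; e++) {
      out[enc.exceptionIndexes[e]] |= enc.exceptionHighBits[e] << enc.width;
    }
    for (int i = 0; i < out.length; i++) {
      out[i] += enc.base;
    }
    return out;
  }

  public static void main(String[] args) {
    int[] values = {3, 4, 5, 3, 200, 4};
    Encoded enc = encode(values, 2); // 200 does not fit in 2 bits, so it is patched
    System.out.println(Arrays.toString(decode(enc)));
  }
}
```

   In this sketch, most values fit in the chosen width and only the outlier 
pays for extra storage, which is the general idea behind patching.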
   
   Support for creating and modifying bitmaps will be added in later PRs, 
similar to the approach for variant where `SerializedValue` implementations 
were added first as a building block for mutable implementations.
   
   This also includes a benchmark that compares the PFOR implementation to 
JavaFastPFOR. The comparison is not a fair one: JavaFastPFOR is intended for 
large arrays and vectorization, while the cases tested are very small arrays 
that don't benefit from vectorization and incur high overhead. The purpose of 
the benchmark is to show that it doesn't make sense to delegate to 
JavaFastPFOR for a small descriptor array. The benchmark probably won't be 
committed in its current form, but I wanted to make it available to 
reviewers.
   
   Co-Authored-By: Claude Code (Opus 4.7, 1M context) <[email protected]>


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

