rdblue opened a new pull request, #16747: URL: https://github.com/apache/iceberg/pull/16747
This is a simple implementation of Mumbling bitmap that has been proposed for embedded bitmaps in v4 metadata. This implementation includes 3 main classes:

* `BitPacking`: bit packing and unpacking implementations for specific widths (1-7) used for descriptor encoding
* `PFOREncoding`: [patched frame-of-reference](https://github.com/apache/iceberg/blob/main/format/mumbling-spec.md#pfor-encoding) encoding and decoding for descriptor bytes
* `MumblingBitmap`: a read-only implementation of Mumbling bitmap

Support for creating and modifying bitmaps will be added in later PRs, similar to the approach for variant, where `SerializedValue` implementations were added first as a building block for mutable implementations.

This also includes a benchmark to compare the PFOR implementation to JavaFastPFOR. This is not a fair comparison because JavaFastPFOR is intended for large arrays and vectorization, but the use cases tested are very small arrays that don't benefit from vectorization and incur high overhead. The reason for the benchmark is to show that it doesn't make sense to delegate to JavaFastPFOR for a small descriptor array. This benchmark probably won't be committed in its current form, but I wanted to make it available for reviewers.

Co-Authored-By: Claude Code (Opus 4.7, 1M context) <[email protected]>
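
For readers unfamiliar with fixed-width bit packing, the idea behind packing small values at widths 1-7 can be sketched roughly as below. This is an illustration only and not the `BitPacking` class from this PR: the class name `BitPackingSketch`, the method signatures, and the least-significant-bit-first layout are all assumptions made for the example, and the actual spec may lay bits out differently or use unrolled, width-specific routines.

```java
/** Hypothetical illustration only; this is not the BitPacking class from the PR. */
final class BitPackingSketch {
  private BitPackingSketch() {}

  /** Packs values into bytes, LSB-first (assumed layout). Each value must fit in `width` bits. */
  static byte[] pack(int[] values, int width) {
    checkWidth(width);
    byte[] packed = new byte[(values.length * width + 7) / 8];
    int bitPos = 0;
    for (int value : values) {
      if (value < 0 || value >= (1 << width)) {
        throw new IllegalArgumentException("Value does not fit in " + width + " bits: " + value);
      }
      int byteIndex = bitPos >>> 3;
      int shift = bitPos & 7;
      // Low bits of the value go into the current byte.
      packed[byteIndex] |= (byte) (value << shift);
      // Any bits that did not fit spill into the next byte.
      if (shift + width > 8) {
        packed[byteIndex + 1] |= (byte) (value >>> (8 - shift));
      }
      bitPos += width;
    }
    return packed;
  }

  /** Unpacks `count` values of `width` bits each from an LSB-first packed array. */
  static int[] unpack(byte[] packed, int count, int width) {
    checkWidth(width);
    int[] values = new int[count];
    int mask = (1 << width) - 1;
    int bitPos = 0;
    for (int i = 0; i < count; i++) {
      int byteIndex = bitPos >>> 3;
      int shift = bitPos & 7;
      int bits = (packed[byteIndex] & 0xFF) >>> shift;
      // Pull in the remaining high bits when the value straddles a byte boundary.
      if (shift + width > 8) {
        bits |= (packed[byteIndex + 1] & 0xFF) << (8 - shift);
      }
      values[i] = bits & mask;
      bitPos += width;
    }
    return values;
  }

  private static void checkWidth(int width) {
    if (width < 1 || width > 7) {
      throw new IllegalArgumentException("Width must be between 1 and 7: " + width);
    }
  }
}
```

With this sketch, eight 3-bit values occupy three bytes instead of eight, and `unpack(pack(values, 3), values.length, 3)` returns the original values. A generic loop like this is easy to follow, but width-specific implementations can avoid the per-value branch, which is presumably why the PR describes implementations "for specific widths".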
