jhorstmann opened a new pull request, #8102: URL: https://github.com/apache/arrow-rs/pull/8102
# Which issue does this PR close? This is a proof of concept for possibly the fastest way of decoding bitpacked data, and also a showcase why writing short rle runs can be detrimental to performance if decoding of bitpacked data is fast. - Related to #7739. # Rationale for this change At the moment I'm not proposing to integrate this into the arrow codebase, the code would need further changes to support processing arbitrary batch sizes and is currently also only supporting a single bitwidth and only `u8` as the target data type. # What changes are included in this PR? The benchmark includes a custom rle/bitpacking hybrid encoder that only supports a bitwidth of 1 and only writes bitpacked runs. On mostly random input data, the size of the encoded buffer is comparable to the size created by the standard `RleEncoder`. Decoding using standard `RleDecoder` also shows that decoding bitpacked data is slightly faster. To run the benchmarks, you will need Rust 1.89, which stabilized avx512 support, and an avx512-capable machine (at least Intel Icelake or AMD Zen4). ``` $ RUSTFLAGS="-Ctarget-cpu=native" cargo bench --features experimental --bench rle ``` ``` rle_decoder/decode_bitpacked time: [398.06 µs 398.66 µs 399.32 µs] thrpt: [2.6259 Gelem/s 2.6302 Gelem/s 2.6342 Gelem/s] rle_decoder/decode_hybrid time: [540.05 µs 542.22 µs 544.84 µs] thrpt: [1.9246 Gelem/s 1.9339 Gelem/s 1.9416 Gelem/s] ``` The results are more interesting when decoding with a custom, AVX512-VBMI optimized decoder: ``` custom/decode_bitpacked time: [17.642 µs 17.661 µs 17.683 µs] thrpt: [59.297 Gelem/s 59.372 Gelem/s 59.435 Gelem/s] custom/decode_hybrid time: [87.593 µs 87.866 µs 88.184 µs] thrpt: [11.891 Gelem/s 11.934 Gelem/s 11.971 Gelem/s] ``` Decoding bitpacked data gets a speedup of about 22x, while decoding hybrid rle data *only* gets about 6x faster. My guess would be that this is caused by branch prediction, or the call to a `memset`-like function which is not optimized for short data. # Are these changes tested? # Are there any user-facing changes? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org