AntoinePrv opened a new issue, #47895: URL: https://github.com/apache/arrow/issues/47895
### Describe the enhancement requested

Right now, the `unpack` family of functions extracts fewer elements than requested. This is because it relies on batch extraction that must process many inputs at once. Instead, `BitReader::GetBatch` is responsible for handling the inputs before (prolog) and after (epilog) `unpack`.

This has two downsides:
- It makes the general Parquet C++ logic harder to understand, as related functions are spread apart;
- It makes `unpack` harder to (re)use, as it does not fully extract all that is needed. In particular, it makes it hard to iterate on these functions because the tests/benchmarks would need to adapt to the number of elements that the function can work with.

The prolog and epilog should be moved to the `unpack` functions so that one function is fully responsible for unpacking integers without extra complexity.

### Component(s)

C++, Parquet

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
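To make the proposal above concrete, here is a minimal sketch of what a single entry point owning the prolog, the batched body, and the epilog could look like. It is not Arrow's actual code: `ReadOne`, `UnpackBatch`, `UnpackAll`, `PackValues`, and the batch size of 32 are all hypothetical names and values chosen for illustration, and the sketch assumes LSB-first bit packing.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar helper: read one `bit_width`-bit value (LSB-first
// packing assumed) starting at absolute bit position `bit_pos`.
inline uint32_t ReadOne(const uint8_t* in, size_t bit_pos, int bit_width) {
  uint64_t value = 0;
  for (int i = 0; i < bit_width; ++i) {
    const size_t p = bit_pos + static_cast<size_t>(i);
    const uint64_t bit = (in[p / 8] >> (p % 8)) & 1u;
    value |= bit << i;
  }
  return static_cast<uint32_t>(value);
}

// Hypothetical stand-in for a fast kernel that can only handle exactly
// kBatch values starting on a byte boundary. A real kernel would be
// vectorized; this one is scalar to keep the sketch short.
constexpr size_t kBatch = 32;

inline void UnpackBatch(const uint8_t* in, uint32_t* out, int bit_width) {
  for (size_t i = 0; i < kBatch; ++i) {
    out[i] = ReadOne(in, i * static_cast<size_t>(bit_width), bit_width);
  }
}

// Hypothetical single entry point: extracts exactly `count` values,
// so callers need no prolog/epilog logic of their own.
inline void UnpackAll(const uint8_t* in, size_t bit_offset, int bit_width,
                      size_t count, uint32_t* out) {
  size_t pos = bit_offset;
  size_t done = 0;

  // Prolog: scalar reads until the position is byte-aligned. If alignment
  // is never reached, this loop simply handles every value.
  while (done < count && pos % 8 != 0) {
    out[done++] = ReadOne(in, pos, bit_width);
    pos += static_cast<size_t>(bit_width);
  }

  // Body: full batches. kBatch * bit_width bits is a whole number of
  // bytes, so the position stays byte-aligned between batches.
  while (count - done >= kBatch) {
    UnpackBatch(in + pos / 8, out + done, bit_width);
    done += kBatch;
    pos += kBatch * static_cast<size_t>(bit_width);
  }

  // Epilog: scalar reads for the leftover values.
  while (done < count) {
    out[done++] = ReadOne(in, pos, bit_width);
    pos += static_cast<size_t>(bit_width);
  }
}

// Hypothetical helper for tests: pack `values` LSB-first at `bit_offset`.
inline std::vector<uint8_t> PackValues(const std::vector<uint32_t>& values,
                                       size_t bit_offset, int bit_width) {
  const size_t total_bits =
      bit_offset + values.size() * static_cast<size_t>(bit_width);
  std::vector<uint8_t> out((total_bits + 7) / 8, 0);
  size_t pos = bit_offset;
  for (uint32_t v : values) {
    for (int i = 0; i < bit_width; ++i, ++pos) {
      if ((v >> i) & 1u) out[pos / 8] |= static_cast<uint8_t>(1u << (pos % 8));
    }
  }
  return out;
}
```

With this shape, a test or benchmark can request any number of values at any bit offset and compare against the input directly, rather than adapting to the batch size the kernel happens to support.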
