AntoinePrv opened a new issue, #47895:
URL: https://github.com/apache/arrow/issues/47895

   ### Describe the enhancement requested
   
   Right now, the `unpack` family of functions extracts fewer elements than 
requested.
   This is because they rely on batch extraction, which must process many 
inputs at once.
   Instead, `BitReader::GetBatch` is responsible for handling the inputs that 
come before (prolog) and after (epilog) the part that `unpack` processes.
   
   This has two downsides:
   - It makes the general parquet C++ logic harder to understand, as related 
functions are spread apart;
   - It makes `unpack` harder to (re)use, as it does not extract everything 
that is needed. In particular, it makes it hard to iterate on these functions 
because the tests/benchmarks would need to adapt to the number of elements 
that the function can work with.
   
   The prolog and epilog should be moved to the `unpack` functions so that one 
function is fully responsible for unpacking integers without extra complexity.
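
A rough sketch of the proposed shape, not the actual Arrow code. It assumes values are bit-packed LSB-first, and that the batch kernel decodes exactly 32 values from a byte-aligned position. The names `ReadOne`, `UnpackBatch32`, and `UnpackAll` are hypothetical and only illustrate one function owning the prolog, the batched body, and the epilog:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: read one `bit_width`-bit value (LSB-first packing
// assumed) starting at absolute bit position `bit_pos`.
inline uint32_t ReadOne(const uint8_t* data, size_t bit_pos, int bit_width) {
  uint32_t value = 0;
  for (int i = 0; i < bit_width; ++i) {
    const size_t p = bit_pos + static_cast<size_t>(i);
    const uint32_t bit = (data[p / 8] >> (p % 8)) & 1u;
    value |= bit << i;
  }
  return value;
}

// Hypothetical stand-in for an optimized batch kernel: decodes exactly 32
// values from byte-aligned input. A real kernel would be unrolled or SIMD.
inline void UnpackBatch32(const uint8_t* in, int bit_width, uint32_t* out) {
  for (int i = 0; i < 32; ++i) {
    out[i] = ReadOne(in, static_cast<size_t>(i) * bit_width, bit_width);
  }
}

// Decodes exactly `num_values` values starting at `bit_offset` bits into
// `data`, so that callers need no prolog/epilog handling of their own.
inline void UnpackAll(const uint8_t* data, size_t bit_offset, int bit_width,
                      int num_values, uint32_t* out) {
  assert(bit_width >= 0 && bit_width <= 32);
  size_t bit_pos = bit_offset;
  int done = 0;

  // Prolog: one value at a time until the read position is byte-aligned.
  while (done < num_values && bit_pos % 8 != 0) {
    out[done++] = ReadOne(data, bit_pos, bit_width);
    bit_pos += static_cast<size_t>(bit_width);
  }

  // Body: full batches of 32 values. Each batch spans 32 * bit_width bits,
  // a whole number of bytes, so the position stays aligned.
  while (num_values - done >= 32) {
    UnpackBatch32(data + bit_pos / 8, bit_width, out + done);
    done += 32;
    bit_pos += 32 * static_cast<size_t>(bit_width);
  }

  // Epilog: whatever is left over, one value at a time.
  while (done < num_values) {
    out[done++] = ReadOne(data, bit_pos, bit_width);
    bit_pos += static_cast<size_t>(bit_width);
  }
}
```

With this shape, tests and benchmarks could call the function with any `num_values` and any starting bit offset, instead of adapting to the batch size of the kernel.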
   
   ### Component(s)
   
   C++, Parquet


