lidavidm opened a new pull request #9620:
URL: https://github.com/apache/arrow/pull/9620


   This provides an async Parquet reader where the unit of concurrency is a 
single row group. 
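
   As a rough illustration of "one row group per unit of concurrency", here is a minimal standalone sketch. It does not use the Arrow or Parquet APIs and is not the code in this PR: `RowGroupResult`, `ReadRowGroup`, and `ReadAllRowGroups` are hypothetical stand-ins, and `std::async` is swapped in for whatever future/executor machinery the PR actually uses.

```cpp
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for a decoded row group; not an Arrow type.
struct RowGroupResult {
  int index;
  std::int64_t num_rows;
};

// Hypothetical stand-in for decoding a single row group.
// The row count is made up so the result depends on the index.
RowGroupResult ReadRowGroup(int index) {
  return RowGroupResult{index, 1000 + index};
}

// Launch one asynchronous task per row group, then collect the
// results in row-group order by waiting on each future in turn.
std::vector<RowGroupResult> ReadAllRowGroups(int num_row_groups) {
  std::vector<std::future<RowGroupResult>> futures;
  futures.reserve(num_row_groups);
  for (int i = 0; i < num_row_groups; ++i) {
    futures.push_back(std::async(std::launch::async, ReadRowGroup, i));
  }

  std::vector<RowGroupResult> results;
  results.reserve(futures.size());
  for (auto& f : futures) {
    results.push_back(f.get());
  }
  return results;
}
```

   Tasks may finish in any order, but because the futures are consumed in the order they were created, the output stays in row-group order.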
   
   There are still some caveats:
   - [ ] This implementation is unsafe if pre_buffer=True in 
ArrowReaderProperties. Instead, the user needs to call 
file_reader()->parquet_reader()->PreBuffer() manually. I expect the kind of 
application using the async reader would want to control this anyway, so I'd 
lean towards simply failing the call if the user has pre_buffer=True. The other 
commit in this PR provides a version that is safe with pre_buffer=True, at the 
cost of some code duplication.
   - [ ] There are some TODOs scattered around for after #9607 is merged.
   - [ ] Docstrings need writing.
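
   The "just fail the call" option from the first caveat could look roughly like the sketch below. This is only an illustration under assumptions: `ReaderOptions`, `CheckResult`, and `CheckAsyncReadAllowed` are hypothetical names, not Arrow types, and real Arrow code would presumably report the error through its own status type rather than this ad hoc struct.

```cpp
#include <string>

// Hypothetical stand-in for the relevant part of ArrowReaderProperties.
struct ReaderOptions {
  bool pre_buffer = false;
};

// Hypothetical result type; ok == true means the call may proceed.
struct CheckResult {
  bool ok;
  std::string message;
};

// Reject the async read up front when pre_buffer is enabled, so the
// caller is pushed towards pre-buffering explicitly instead.
CheckResult CheckAsyncReadAllowed(const ReaderOptions& options) {
  if (options.pre_buffer) {
    return CheckResult{
        false, "async read is not supported with pre_buffer=true; "
               "pre-buffer manually before reading"};
  }
  return CheckResult{true, ""};
}
```

   The trade-off is the one described in the caveat: the guard keeps the reader simple, while the alternative commit accepts pre_buffer=True at the cost of some code duplication.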



