westonpace opened a new issue, #36778: URL: https://github.com/apache/arrow/issues/36778
### Describe the enhancement requested

We do actually already have an asynchronous version with the method `GetRecordBatchGenerator`. However, that method does not respect the batch size property, and it cannot read less than one entire row group at a time. This makes it difficult to read large Parquet files and makes the scanner's memory usage heavily dependent on a Parquet file's row group size.

This issue requests a new method that is a closer analogue to the existing `ReadRowGroup`/`ReadRowGroups` methods. Once the scanner moves over to this new method, I think we can deprecate `GetRecordBatchGenerator`. I hesitate to replace it immediately as I don't want to introduce any breaking changes into the existing scanner path.

### Component(s)

C++
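
To make the memory concern concrete, here is a minimal, self-contained sketch that compares how many rows each emitted item holds when row groups are emitted whole versus when they are sliced to a batch size. It does not use Arrow at all: the helper names `WholeRowGroupBatches`, `BatchSizedSlices`, and `LargestBatch` are hypothetical and exist only for illustration, and slicing within row-group boundaries is an assumption about how a batch-size-aware reader might behave, not a description of the proposed method.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow API): rows per emitted item when every
// row group is emitted whole, regardless of any batch size setting.
std::vector<int64_t> WholeRowGroupBatches(const std::vector<int64_t>& row_group_rows) {
  return row_group_rows;
}

// Hypothetical helper (not an Arrow API): rows per emitted item when each
// row group is sliced into pieces of at most `batch_size` rows. Slices never
// span two row groups in this sketch.
std::vector<int64_t> BatchSizedSlices(const std::vector<int64_t>& row_group_rows,
                                      int64_t batch_size) {
  std::vector<int64_t> out;
  if (batch_size <= 0) return out;  // nothing sensible to emit
  for (int64_t rows : row_group_rows) {
    for (int64_t offset = 0; offset < rows; offset += batch_size) {
      out.push_back(std::min(batch_size, rows - offset));
    }
  }
  return out;
}

// Largest number of rows any single emitted item holds; a rough proxy for
// the peak memory a consumer needs per item.
int64_t LargestBatch(const std::vector<int64_t>& batches) {
  return batches.empty() ? 0 : *std::max_element(batches.begin(), batches.end());
}
```

With whole row groups, the largest item is as large as the largest row group in the file. With slicing, it is bounded by the batch size, so per-item memory no longer depends on how the file was written.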
