etseidl opened a new pull request, #7369:
URL: https://github.com/apache/arrow-rs/pull/7369

   # Which issue does this PR close?
   
   
   Might fix #6476.
   
   # Rationale for this change
   `ArrowReaderMetadata::load_async` sometimes had to make multiple passes to 
fully load Parquet metadata when page indexes were requested. This is because 
the `AsyncFileReader::get_metadata` function had no way of knowing whether page 
indexes are desired. Recent API changes allow this information to be passed to 
`AsyncFileReader`, so the extra page index logic in `load_async` should no 
longer be necessary.
   
   This version will still do multiple fetches, since no prefetch hint is passed 
to the metadata reader. A follow-on PR could add this hint to 
`ArrowReaderOptions`, but that would be a breaking change.
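
As a rough illustration of the rationale above, the sketch below models where the page index logic lives before and after the change. Every name in it (`CountingSource`, `Options`, `Metadata`, `load_old`, `load_new`, and the two `get_metadata_*` functions) is hypothetical and stands in for the real `parquet` types; none of it is the crate's actual API. It only shows that the caller-side fix-up disappears once the options reach the reader, while the number of fetches stays the same because no prefetch hint is modeled.

```rust
/// Hypothetical stand-in for a remote file; it only counts fetch passes.
struct CountingSource {
    passes: usize,
}

impl CountingSource {
    fn fetch(&mut self, _what: &str) {
        self.passes += 1;
    }
}

/// Hypothetical options; models only whether page indexes are desired.
#[derive(Clone, Copy)]
struct Options {
    page_index: bool,
}

/// Hypothetical metadata; records only whether page indexes were loaded.
#[derive(Debug, PartialEq)]
struct Metadata {
    has_page_index: bool,
}

/// Old shape: the reader cannot see the options, so it loads the footer only.
fn get_metadata_without_options(src: &mut CountingSource) -> Metadata {
    src.fetch("footer");
    Metadata { has_page_index: false }
}

/// Old shape: the caller has to patch up the metadata in a second pass.
fn load_old(src: &mut CountingSource, opts: Options) -> Metadata {
    let mut md = get_metadata_without_options(src);
    if opts.page_index && !md.has_page_index {
        src.fetch("page index"); // extra logic living in the caller
        md.has_page_index = true;
    }
    md
}

/// New shape: the options reach the reader, which loads everything itself.
fn get_metadata_with_options(src: &mut CountingSource, opts: Options) -> Metadata {
    src.fetch("footer");
    if opts.page_index {
        src.fetch("page index"); // still a separate fetch: no prefetch hint
    }
    Metadata { has_page_index: opts.page_index }
}

/// New shape: the caller simply delegates, with no fix-up logic of its own.
fn load_new(src: &mut CountingSource, opts: Options) -> Metadata {
    get_metadata_with_options(src, opts)
}
```

In this toy model both paths return the same metadata and perform the same number of fetches when page indexes are requested; the difference is only that `load_new` no longer carries its own page index handling.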
   
   # What changes are included in this PR?
   
   Convert `AsyncFileReader::get_metadata` to use the new 
`ParquetMetaDataReader::load_via_suffix_and_finish` API to reduce code 
duplication, and add a `MetadataSuffixFetch` implementation to allow its use in 
`ArrowReaderMetadata::load_async`.
   
   # Are there any user-facing changes?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
