kylebarron opened a new pull request, #7334: URL: https://github.com/apache/arrow-rs/pull/7334
# Which issue does this PR close?

Closes https://github.com/apache/arrow-rs/issues/5979.

# Rationale for this change

Avoid needing to know file size in advance to read Parquet files.

# What changes are included in this PR?

- Add `MetadataSuffixFetch` trait which expands on `MetadataFetch` to additionally support reading suffix range requests.
- **Breaking**: Change `ParquetObjectReader::new` to a signature of `new(store: Arc<dyn ObjectStore>, path: Path, file_size: Option<usize>)`, so that `file_size` is no longer required.
- Implement `MetadataSuffixFetch` for `ParquetObjectReader`, using `ObjectStore::get_opts`.
- Always prefer reading metadata via bounded range requests if the file size is provided, and only fall back to suffix range requests if the file size is `None`.

Todo:

- [ ] Add a couple tests for this

# Are there any user-facing changes?

Yes, breaking change to the signature of `ParquetObjectReader::new`. I'd like to get this in to https://github.com/apache/arrow-rs/issues/7084.

Supersedes and closes https://github.com/apache/arrow-rs/pull/6157.
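For illustration, the preference described above (bounded range when the file size is known, suffix range otherwise) can be sketched with the standard library alone. This is not the code in this PR: `FooterRequest`, `plan_footer_request`, and `fetch` are hypothetical names, and a byte slice stands in for the object store. The only Parquet fact relied on is that a file ends with an 8-byte footer (a 4-byte little-endian metadata length followed by the magic bytes `PAR1`).

```rust
use std::ops::Range;

/// A Parquet file ends with 8 bytes: a 4-byte little-endian metadata
/// length followed by the magic bytes "PAR1".
const FOOTER_SIZE: usize = 8;

/// Hypothetical description of the request used to fetch the footer.
/// These names are illustrative and are not part of arrow-rs.
#[derive(Debug, PartialEq)]
enum FooterRequest {
    /// Bounded byte range; only possible when the file size is known.
    Bounded(Range<usize>),
    /// "Last N bytes" request; needs no file size.
    Suffix(usize),
}

/// Prefer a bounded range when the size is provided, and fall back to a
/// suffix range when it is `None`.
fn plan_footer_request(file_size: Option<usize>) -> FooterRequest {
    match file_size {
        Some(size) => FooterRequest::Bounded(size.saturating_sub(FOOTER_SIZE)..size),
        None => FooterRequest::Suffix(FOOTER_SIZE),
    }
}

/// Toy in-memory stand-in for an object store: resolves either kind of
/// request against a byte slice.
fn fetch(data: &[u8], request: &FooterRequest) -> Vec<u8> {
    match request {
        FooterRequest::Bounded(range) => data[range.clone()].to_vec(),
        FooterRequest::Suffix(n) => data[data.len().saturating_sub(*n)..].to_vec(),
    }
}
```

Both paths return the same footer bytes for the same file; the suffix path simply leaves it to the store to resolve the offset, which is what makes `file_size: Option<usize>` workable.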
