etseidl commented on code in PR #6367:
URL: https://github.com/apache/arrow-rs/pull/6367#discussion_r1759190210
##########
parquet/src/arrow/async_reader/metadata.rs:
##########
@@ -57,6 +57,12 @@ impl<F: MetadataFetch> MetadataLoader<F> {
             return Err(ParquetError::EOF(format!(
                 "file size of {file_size} is less than footer"
             )));
+        } else if let Some(size_hint) = prefetch {
+            if size_hint < 8 {
+                return Err(ParquetError::EOF(format!(
+                    "prefetch size of {size_hint} is less than footer size"
+                )));
+            }
Review Comment:
I'd prefer not to penalize a user for over-specifying the prefetch. I think
of it as a budget: you can use up to 2MB of prefetch. I can see angry AWS
customers complaining that a metadata fetch that should have been a single GET
now takes three 😅.
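
To make the "budget" idea concrete, here is a minimal sketch of how the hint could be capped rather than rejected when it is larger than needed. It is an illustration only, not the code in this PR: `footer_fetch_range` and `FOOTER_SIZE` are hypothetical names, the error type is a plain `String` instead of `ParquetError`, and capping the hint at the file size is an assumption about the desired behavior. The check for a hint smaller than the 8-byte footer is kept as in the diff above.

```rust
use std::ops::Range;

/// Size of the fixed Parquet footer: 4-byte metadata length plus 4-byte magic.
const FOOTER_SIZE: usize = 8;

/// Hypothetical helper (not part of arrow-rs): returns the byte range of the
/// first fetch, treating `prefetch` as an upper bound on bytes to read.
fn footer_fetch_range(
    file_size: usize,
    prefetch: Option<usize>,
) -> Result<Range<usize>, String> {
    if file_size < FOOTER_SIZE {
        return Err(format!("file size of {file_size} is less than footer"));
    }
    let budget = match prefetch {
        // A hint too small to hold the footer is still rejected, as in the diff.
        Some(size_hint) if size_hint < FOOTER_SIZE => {
            return Err(format!(
                "prefetch size of {size_hint} is less than footer size"
            ));
        }
        // Assumption: an oversized hint is capped at the file size, so the
        // whole file is read in one request instead of failing.
        Some(size_hint) => size_hint.min(file_size),
        // Without a hint, fetch only the footer itself.
        None => FOOTER_SIZE,
    };
    Ok(file_size - budget..file_size)
}
```

With this shape, a 100-byte file and a 2MB hint yield the range `0..100`, so the metadata still arrives in a single request and the over-generous hint costs nothing extra.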