Xuanwo commented on code in PR #6367:
URL: https://github.com/apache/arrow-rs/pull/6367#discussion_r1747420929
##########
parquet/src/arrow/async_reader/metadata.rs:
##########
@@ -71,7 +71,10 @@ impl<F: MetadataFetch> MetadataLoader<F> {
let suffix_len = suffix.len();
let mut footer = [0; 8];
- footer.copy_from_slice(&suffix[suffix_len - 8..suffix_len]);
+ let Some(footer_start) = suffix_len.checked_sub(8) else {
Review Comment:
This change looks good to me initially, but I'm wondering about the logic
behind it.
First of all, it should fail at `file_size < 8`, right? We need to clarify
this in our API documentation regarding the validation of `file_size`.
Additionally, it's possible that users provide an incorrect file size. The
result of `fetch.fetch(footer_start..file_size).await?;` seems incorrect. We
should specify that `fetch` must handle this correctly, returning an error if
the range of `footer_start..file_size` fails to read. We can add a check here:
if `suffix_len != file_size - footer_start`, we can raise an error like `EOF`.
What do you think? My motivation is that we should return the error at the
correct place instead of here, where it is just a side effect.
---
I believe this error could be reproduced by an invalid file size or a
malfunctioning `fetch` that returns a smaller size than expected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]