Xuanwo commented on code in PR #6367:
URL: https://github.com/apache/arrow-rs/pull/6367#discussion_r1747420929


##########
parquet/src/arrow/async_reader/metadata.rs:
##########
@@ -71,7 +71,10 @@ impl<F: MetadataFetch> MetadataLoader<F> {
         let suffix_len = suffix.len();
 
         let mut footer = [0; 8];
-        footer.copy_from_slice(&suffix[suffix_len - 8..suffix_len]);
+        let Some(footer_start) = suffix_len.checked_sub(8) else {

Review Comment:
   This change looks good to me initially, but I'm wondering about the logic 
behind it.
   
   First of all, it should fail at `file_size < 8`, right? We need to clarify 
this in our API documentation regarding the validation of `file_size`.
   
   Additionally, it's possible that users provide an incorrect file size. The 
result of `fetch.fetch(footer_start..file_size).await?;` seems incorrect. We 
should specify that `fetch` must handle this correctly, returning an error if 
the range of `footer_start..file_size` fails to read. We can add a check here: 
if `suffix_len != file_size - footer_start`, we can raise an error like `EOF`.
   
   What do you think? My motivation is that we should return the error at the 
correct place instead of here, where it is just a side effect.
   
   ---
   
   I believe this error could be reproduced by an invalid file size or a 
malfunctioning `fetch` that returns a smaller size than expected.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to