etseidl commented on PR #8080:
URL: https://github.com/apache/arrow-rs/pull/8080#issuecomment-3190064509

   Ok, I'm starting to grok this. I merged this branch into my current thrift 
branch and changed `try_decode` to:
   ```rust
     pub fn try_decode(
         &mut self,
     ) -> std::result::Result<DecodeResult<ParquetMetaData>, ParquetError> {
         if self.done {
             return Ok(DecodeResult::Finished);
         }
   
         // need to have the last 8 bytes of the file to decode the metadata
         let file_len = self.buffers.file_len();
         if !self.buffers.has_range(&(file_len - 8..file_len)) {
             #[expect(clippy::single_range_in_vec_init)]
             return Ok(DecodeResult::NeedsData(vec![file_len - 8..file_len]));
         }
   
         // Try to parse the metadata from the buffers we have.
         // If we don't have enough data, it will return a
         // `ParquetError::NeedMoreData` with the number of bytes needed to
         // complete the metadata parsing.
         // If we have enough data, it will return `Ok(())`.
         let footer_bytes = self
             .buffers
             .get_bytes(file_len - FOOTER_SIZE as u64, FOOTER_SIZE)?;
         let mut footer = [0_u8; FOOTER_SIZE];
         footer_bytes.as_ref().copy_to_slice(&mut footer);
         let footer = ParquetMetaDataReader::decode_footer_tail(&footer)?;
   
         let metadata_len = footer.metadata_length();
         let footer_metadata_len = FOOTER_SIZE + metadata_len;
         let footer_start = file_len - footer_metadata_len as u64;
         let footer_end = file_len - FOOTER_SIZE as u64;
         if !self.buffers.has_range(&(footer_start..footer_end)) {
             #[expect(clippy::single_range_in_vec_init)]
             return Ok(DecodeResult::NeedsData(vec![footer_start..file_len]));
         }
   
         let metadata_bytes = self.buffers.get_bytes(footer_start, metadata_len)?;
         let metadata = ParquetMetaDataReader::decode_file_metadata(&metadata_bytes)?;
         self.done = true;
         Ok(DecodeResult::Data(metadata))
     }
   ```
   No page indexes yet, but this seems pretty nice 👍 Once I have the page 
indexes converted the parser should get pretty simple.
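
For anyone following the range arithmetic in `try_decode`, here is a minimal standalone sketch of the two byte ranges it asks for. It assumes the plain (unencrypted) Parquet footer layout, where the last 8 bytes of the file are a 4-byte little-endian metadata length followed by the `PAR1` magic. The helper names `footer_tail_range`, `metadata_len_from_tail`, and `metadata_range` are hypothetical and not part of the `parquet` crate; the real decoding goes through `ParquetMetaDataReader::decode_footer_tail`.

```rust
use std::ops::Range;

/// Size of the footer tail: 4-byte little-endian metadata length + 4-byte magic.
const FOOTER_SIZE: usize = 8;

/// Hypothetical helper: byte range of the footer tail (the last 8 bytes).
/// Returns `None` if the file is too short to contain a footer.
fn footer_tail_range(file_len: u64) -> Option<Range<u64>> {
    let start = file_len.checked_sub(FOOTER_SIZE as u64)?;
    Some(start..file_len)
}

/// Hypothetical helper: read the metadata length out of the footer tail.
/// Only the plain `PAR1` magic is handled here (assumption: no encrypted footer).
fn metadata_len_from_tail(tail: &[u8; FOOTER_SIZE]) -> Option<usize> {
    if &tail[4..] != b"PAR1" {
        return None;
    }
    let len = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    Some(len as usize)
}

/// Hypothetical helper: byte range of the serialized metadata, which sits
/// immediately before the footer tail. Returns `None` if the claimed length
/// does not fit in the file.
fn metadata_range(file_len: u64, metadata_len: usize) -> Option<Range<u64>> {
    let end = file_len.checked_sub(FOOTER_SIZE as u64)?;
    let start = end.checked_sub(metadata_len as u64)?;
    Some(start..end)
}
```

The checked subtractions are a deliberate difference from the snippet above: `file_len - 8` would underflow for a file shorter than the footer, so the sketch reports that case as `None` instead.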
   

