pitrou opened a new issue, #48858:
URL: https://github.com/apache/arrow/issues/48858

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   When reading an encrypted Parquet file with a plaintext footer, the Parquet 
reader is able to verify footer integrity by comparing the signature in the 
file with the one computed by encrypting the footer.
   
   However, it does this by first re-serializing the deserialized footer 
using Thrift. This has several issues:
   1. it's inefficient
   2. it's not obvious that it will always produce the same Thrift encoding as 
the original, leading to spurious signature verification failures
   3. if the original footer deserializes to invalid enum values, attempting to 
serialize it again will lead to undefined behavior
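
To make issue 2 concrete, here is a minimal sketch using a toy ULEB128 varint codec (the integer encoding used by Thrift's compact protocol). It is not Thrift's actual code, and the function names are hypothetical. It assumes a decoder that tolerates redundant continuation bytes: such a decoder accepts a padded encoding, but a canonical encoder never reproduces it, so the re-serialized bytes differ from the original ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one ULEB128 varint starting at *pos. Deliberately lenient: padded
// (non-canonical) encodings such as {0x85, 0x00} are accepted.
uint64_t DecodeVarint(const std::vector<uint8_t>& buf, size_t* pos) {
  assert(pos != nullptr);
  uint64_t value = 0;
  int shift = 0;
  while (*pos < buf.size() && shift < 64) {
    const uint8_t byte = buf[(*pos)++];
    value |= static_cast<uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) break;  // continuation bit clear: last byte
    shift += 7;
  }
  return value;
}

// Encode a value using the shortest (canonical) form.
std::vector<uint8_t> EncodeVarint(uint64_t value) {
  std::vector<uint8_t> out;
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
  return out;
}

// Returns true if decoding and then re-encoding reproduces the input bytes.
bool RoundTripsExactly(const std::vector<uint8_t>& original) {
  size_t pos = 0;
  const uint64_t value = DecodeVarint(original, &pos);
  return EncodeVarint(value) == original;
}
```

With this sketch, `{0x05}` round-trips exactly, while `{0x85, 0x00}` decodes to the same value 5 but is re-encoded as `{0x05}`. A signature computed over the re-encoded bytes would then not match one computed over the bytes in the file.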
   
   Reason 3 is what allowed this to be uncovered by OSS-Fuzz (see 
https://oss-fuzz.com/testcase-detail/4740205688193024).
   
   For these reasons, it would be better to verify the signature against the 
serialized metadata bytes already read from the footer, rather than 
re-serializing them.
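
One possible shape for this, as a rough sketch only: keep a view of the raw footer bytes next to the signature bytes and compute the expected signature directly over that view. `FooterBytes`, `SplitFooter`, `VerifyFooter` and the `TagFn` callback are hypothetical names, not Arrow or Parquet APIs. The callback stands in for whatever computation the reader already uses to derive the signature from the footer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical view over the bytes read from the file (not an Arrow type).
struct FooterBytes {
  const uint8_t* footer = nullptr;     // serialized metadata exactly as stored
  size_t footer_len = 0;
  const uint8_t* signature = nullptr;  // signature bytes following the footer
  size_t signature_len = 0;
};

// Hypothetical callback: computes the expected signature over raw bytes.
using TagFn = std::function<std::vector<uint8_t>(const uint8_t*, size_t)>;

// Split a buffer laid out as [footer bytes][signature bytes] without copying.
bool SplitFooter(const std::vector<uint8_t>& buf, size_t signature_len,
                 FooterBytes* out) {
  assert(out != nullptr);
  if (signature_len > buf.size()) return false;
  out->footer = buf.data();
  out->footer_len = buf.size() - signature_len;
  out->signature = buf.data() + out->footer_len;
  out->signature_len = signature_len;
  return true;
}

// Verify the signature over the original bytes; nothing is re-serialized, so
// the deserialized struct (and any invalid enum value in it) is never touched.
bool VerifyFooter(const FooterBytes& fb, const TagFn& compute_tag) {
  const std::vector<uint8_t> expected = compute_tag(fb.footer, fb.footer_len);
  if (expected.size() != fb.signature_len) return false;
  uint8_t diff = 0;  // accumulate differences instead of returning early
  for (size_t i = 0; i < expected.size(); ++i) {
    diff = static_cast<uint8_t>(diff | (expected[i] ^ fb.signature[i]));
  }
  return diff == 0;
}
```

Because the check only reads the stored bytes, it avoids the extra serialization pass, does not depend on the encoder reproducing the original encoding, and does not require the deserialized footer to be valid.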
   
   
   ### Component(s)
   
   C++, Parquet


