fallintoplace opened a new pull request, #10003:
URL: https://github.com/apache/arrow-rs/pull/10003

   # Which issue does this PR close?
   
   Closes #10002.
   
   # Rationale for this change
   
   Malformed Parquet footer metadata can contain INT96 statistics whose encoded 
min or max value is longer than 12 bytes. The footer metadata conversion path 
checked that INT96 statistics were at least 12 bytes, but then asserted they 
were exactly 12 bytes. That allowed malformed input to panic instead of 
returning an error.
   
   The page-statistics path already returns an error for non-12-byte INT96 
statistics, so this change makes the footer metadata path behave consistently.
   
   # What changes are included in this PR?
   
   This PR replaces the INT96 min/max length assertions in footer metadata 
statistics conversion with explicit `ParquetError` returns.
   
   It also adds a regression test covering overlong INT96 min and max values in 
column metadata statistics.
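
As a rough illustration of the pattern described above (not the actual diff), the sketch below replaces a length assertion with an explicit error return. `StatsError` and `int96_from_stat_bytes` are hypothetical stand-ins; the real change uses the crate's `ParquetError` inside the footer metadata conversion code, whose exact shape is not shown in this message.

```rust
use std::fmt;

/// Hypothetical stand-in for the crate's `ParquetError` (assumption for this sketch).
#[derive(Debug, PartialEq)]
enum StatsError {
    General(String),
}

impl fmt::Display for StatsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StatsError::General(msg) => write!(f, "Parquet error: {}", msg),
        }
    }
}

/// An INT96 value occupies exactly 12 bytes.
const INT96_LEN: usize = 12;

/// Hypothetical helper: decode an INT96 min or max statistic into three
/// little-endian 32-bit words, rejecting any input that is not exactly
/// 12 bytes long instead of asserting on its length.
fn int96_from_stat_bytes(bytes: &[u8]) -> Result<[u32; 3], StatsError> {
    // Before the fix, the equivalent of this check was an `assert_eq!`,
    // so an overlong value from a malformed footer caused a panic.
    if bytes.len() != INT96_LEN {
        return Err(StatsError::General(format!(
            "Incorrect Int96 statistics length: expected {} bytes, got {}",
            INT96_LEN,
            bytes.len()
        )));
    }

    let mut words = [0u32; 3];
    for (word, chunk) in words.iter_mut().zip(bytes.chunks_exact(4)) {
        *word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    Ok(words)
}
```

A regression test in this style would pass a 13-byte min or max value and check that the result is an `Err` rather than a panic.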
   
   # Are these changes tested?
   
   Yes. I ran:
   
   - `cargo fmt --all`
   - `cargo +stable fmt --all -- --check`
   - `cargo fmt -p parquet -- --check --config skip_children=true $(find ./parquet -name "*.rs" ! -name format.rs)`
   - `cargo test -p parquet --lib file::metadata::thrift::tests::test_convert_stats_returns_error_for_overlong_int96_statistics`
   - `cargo test -p parquet --lib file::metadata::thrift::tests`
   - `cargo test -p parquet`
   - `cargo check -p parquet --all-targets`
   - `cargo clippy -p parquet --all-targets --all-features -- -D warnings`
   
   # Are there any user-facing changes?
   
   Malformed INT96 column metadata statistics now return an error instead of 
panicking.
   

