fallintoplace opened a new pull request, #10003: URL: https://github.com/apache/arrow-rs/pull/10003
# Which issue does this PR close?

Closes #10002.

# Rationale for this change

Malformed Parquet footer metadata can contain INT96 statistics whose encoded min or max value is longer than 12 bytes. The footer metadata conversion path checked that INT96 statistics were at least 12 bytes, but then asserted they were exactly 12 bytes. That allowed malformed input to panic instead of returning an error.

The page-statistics path already returns an error for non-12-byte INT96 statistics, so this change makes the footer metadata path behave consistently.

# What changes are included in this PR?

This PR replaces the INT96 min/max length assertions in footer metadata statistics conversion with explicit `ParquetError` returns. It also adds a regression test covering overlong INT96 min and max values in column metadata statistics.

# Are these changes tested?

Yes. I ran:

- `cargo fmt --all`
- `cargo +stable fmt --all -- --check`
- `cargo fmt -p parquet -- --check --config skip_children=true $(find ./parquet -name "*.rs" ! -name format.rs)`
- `cargo test -p parquet --lib file::metadata::thrift::tests::test_convert_stats_returns_error_for_overlong_int96_statistics`
- `cargo test -p parquet --lib file::metadata::thrift::tests`
- `cargo test -p parquet`
- `cargo check -p parquet --all-targets`
- `cargo clippy -p parquet --all-targets --all-features -- -D warnings`

# Are there any user-facing changes?

Malformed INT96 column metadata statistics now return an error instead of panicking.
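For readers unfamiliar with the failure mode, the assertion-to-error change described above can be illustrated with a minimal, self-contained sketch. This is not the code from the PR: `StatsError` and `decode_int96_stat` are hypothetical names standing in for `ParquetError` and the crate's internal conversion logic, and decoding the value as three little-endian `u32` words is an assumption made only for illustration.

```rust
/// Hypothetical stand-in for the crate's error type (not the real `ParquetError`).
#[derive(Debug, PartialEq)]
enum StatsError {
    General(String),
}

/// Encoded size of an INT96 value in bytes.
const INT96_LEN: usize = 12;

/// Hypothetical helper: decode one INT96 statistic (min or max) from raw bytes.
///
/// The pattern being replaced is a `len() >= 12` check followed by
/// `assert!(len() == 12)`, which panics on inputs of 13 bytes or more.
/// Here a single exact-length check returns an error for any other length.
fn decode_int96_stat(bytes: &[u8]) -> Result<[u32; 3], StatsError> {
    if bytes.len() != INT96_LEN {
        return Err(StatsError::General(format!(
            "Incorrect INT96 statistics length: expected {} bytes, got {}",
            INT96_LEN,
            bytes.len()
        )));
    }

    // Assumption for this sketch: three little-endian 32-bit words.
    let mut words = [0u32; 3];
    for (word, chunk) in words.iter_mut().zip(bytes.chunks_exact(4)) {
        *word = u32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
    }
    Ok(words)
}
```

With this shape, a caller that passes a 13-byte min or max value gets an `Err` it can propagate with `?`, rather than a panic while reading the footer.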
