raulcd opened a new pull request, #47466: URL: https://github.com/apache/arrow/pull/47466
### Rationale for this change

Currently we drop all statistics if `SortOrder` is `UNKNOWN`. This seems too broad: some statistics, like `null_count`, could be maintained.

https://github.com/apache/arrow/blob/6f6138b7eedece0841b04f4e235e3bedf5a3ee29/cpp/src/parquet/metadata.cc#L330-L335

Clearing `min`/`max` but keeping `null_count` when `SortOrder` is `UNKNOWN` would allow users to still use it.

### What changes are included in this PR?

Keep statistics when reading them if the sort order is `SortOrder::UNKNOWN`, but clear `min`/`max`.

### Are these changes tested?

Yes, there is a file in parquet-testing that allows us to validate this exact scenario.

### Are there any user-facing changes?

No changes to APIs. Users will be able to read statistics in this case.
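
As a rough illustration of the intended behavior (not the actual code in `metadata.cc`), here is a minimal standalone sketch. `EncodedStats`, `SanitizeStats`, `CheckExample`, and the local `SortOrder` enum are hypothetical stand-ins invented for this example; they are not parquet-cpp types or functions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for the column sort order (not the parquet-cpp enum).
enum class SortOrder { SIGNED, UNSIGNED, UNKNOWN };

// Hypothetical stand-in for the statistics read from column chunk metadata.
struct EncodedStats {
  std::optional<std::string> min;
  std::optional<std::string> max;
  std::optional<int64_t> null_count;
};

// With an UNKNOWN sort order, min/max cannot be interpreted, so they are
// cleared. null_count is kept in every case.
inline EncodedStats SanitizeStats(EncodedStats stats, SortOrder order) {
  if (order == SortOrder::UNKNOWN) {
    stats.min.reset();
    stats.max.reset();
  }
  return stats;
}

// Small usage example: min/max are dropped, null_count survives.
inline void CheckExample() {
  EncodedStats raw;
  raw.min = "a";
  raw.max = "z";
  raw.null_count = 3;

  EncodedStats kept = SanitizeStats(raw, SortOrder::UNKNOWN);
  assert(!kept.min.has_value());
  assert(!kept.max.has_value());
  assert(kept.null_count.has_value() && *kept.null_count == 3);
}
```

The previous behavior corresponds to returning no statistics at all in the `UNKNOWN` branch; the sketch only narrows that to the fields whose meaning depends on ordering.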
