raulcd opened a new pull request, #46992:
URL: https://github.com/apache/arrow/pull/46992
### Rationale for this change
The `is_{min/max}_value_exact` fields exist in the Thrift definition, and
some implementations already write them and truncate the min/max values. This
PR exposes those fields and defaults them to true when writing files from C++,
since no truncation happens at the moment: whenever min/max statistics are
generated, the values are exact, so the flags can be set to true.
Truncation for string and binary min/max is out of scope for this PR; it can
be done in a follow-up.
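
To illustrate why the exactness flags matter to readers, here is a minimal sketch, not the Arrow/Parquet C++ API. `StatsSketch`, `MayContain` and `CanUseMinAsAnswer` are hypothetical names, and an empty `std::optional` is assumed to model a file where the flag was never written. The point is that a truncated min/max is still a valid bound for pruning, but only an exact value can stand in for the real minimum or maximum.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for column-chunk statistics (not the real Statistics class).
struct StatsSketch {
  std::string min;
  std::string max;
  // An empty optional models a file in which the flag was not written.
  std::optional<bool> is_min_value_exact;
  std::optional<bool> is_max_value_exact;
};

// Pruning only needs valid bounds, so it works whether or not they are exact.
bool MayContain(const StatsSketch& s, const std::string& v) {
  return s.min <= v && v <= s.max;
}

// Using the stored min as the actual minimum requires the flag to be present and true.
bool CanUseMinAsAnswer(const StatsSketch& s) {
  return s.is_min_value_exact.value_or(false);
}
```

For example, a chunk whose real bounds are "apple" and "pear" could store the truncated bounds "app" and "pf" with both flags false: `MayContain` still behaves correctly, while `CanUseMinAsAnswer` returns false. Treating an absent flag as "not known to be exact" is a conservative choice made for this sketch only.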
### What changes are included in this PR?
- The fields have been added to EncodedStatistics and Statistics along with
the Thrift integration.
- Tests and validation with a new parquet-testing file in which the
fields are present (https://github.com/apache/parquet-testing/pull/88)
- Tests with existing files without the fields.
- Existing tests updated to validate the new fields.
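
The writer-side default described in the rationale can be sketched roughly as follows. This is an illustration only: `EncodedStatsSketch` and `EncodeWithoutTruncation` are hypothetical names, not the actual `EncodedStatistics` interface, and the sketch assumes the flags are optional so that a file without them can be represented.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for serialized statistics (not the real EncodedStatistics).
struct EncodedStatsSketch {
  std::optional<std::string> min;
  std::optional<std::string> max;
  std::optional<bool> is_min_value_exact;
  std::optional<bool> is_max_value_exact;
};

// A writer that never truncates: any bound it emits is the true value,
// so the matching flag is set to true. With no bound, the flag stays unset.
EncodedStatsSketch EncodeWithoutTruncation(std::optional<std::string> min,
                                           std::optional<std::string> max) {
  EncodedStatsSketch out;
  if (min.has_value()) {
    out.min = std::move(min);
    out.is_min_value_exact = true;
  }
  if (max.has_value()) {
    out.max = std::move(max);
    out.is_max_value_exact = true;
  }
  return out;
}
```

Leaving the flag unset when no min/max is generated is an assumption of this sketch. It mirrors the case of existing files that do not carry the fields.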
### Are these changes tested?
Yes, on CI.
### Are there any user-facing changes?
Yes, the new fields will be available to users through the API when reading
Parquet files.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]