raulcd opened a new pull request, #46992:
URL: https://github.com/apache/arrow/pull/46992

   ### Rationale for this change
   
   The `is_{min/max}_value_exact` fields exist in the Parquet Thrift definition, 
and some implementations already set them when they truncate min/max statistics. 
This PR exposes those fields and defaults them to true when writing files from 
C++, since no truncation happens there at the moment: whenever min/max statistics 
are generated, the values are exact, so the fields can be set to true.
   
   Truncation of string and binary min/max values is out of scope for this PR; 
it can be done in a follow-up.
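
   To illustrate why the exactness flags matter once truncation exists, here is 
a minimal sketch of how a writer might shorten binary bounds. It is not part of 
this PR and is not Arrow code: `BoundSketch`, `TruncateMin`, and `TruncateMax` are 
hypothetical names, and bytewise lexicographic ordering is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical pair of a statistics bound and its exactness flag.
struct BoundSketch {
  std::string value;
  bool is_exact;
};

// A prefix never sorts after the full value, so it is still a valid lower
// bound, but it is no longer the exact minimum.
BoundSketch TruncateMin(const std::string& min, std::size_t max_len) {
  assert(max_len > 0);
  if (min.size() <= max_len) return {min, true};
  return {min.substr(0, max_len), false};
}

// A plain prefix could sort before the full value, so the last byte that can
// be incremented is bumped and everything after it is dropped.
BoundSketch TruncateMax(const std::string& max, std::size_t max_len) {
  assert(max_len > 0);
  if (max.size() <= max_len) return {max, true};
  std::string prefix = max.substr(0, max_len);
  for (std::size_t i = prefix.size(); i-- > 0;) {
    const unsigned char byte = static_cast<unsigned char>(prefix[i]);
    if (byte != 0xFF) {
      prefix[i] = static_cast<char>(byte + 1);
      prefix.resize(i + 1);
      return {prefix, false};
    }
  }
  // Every byte of the prefix is 0xFF: no shorter upper bound exists, so the
  // untruncated (and therefore exact) value is kept.
  return {max, true};
}
```

   With a limit of 3 bytes, `"apple"` would become the lower bound `"app"` or the 
upper bound `"apq"`, both flagged as inexact, while a value that already fits is 
returned unchanged and stays exact.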
   
   ### What changes are included in this PR?
   
   - The fields have been added to EncodedStatistics and Statistics along with 
the Thrift integration.
   - Tests and validation against a newly generated parquet-testing file in which 
these fields are present (https://github.com/apache/parquet-testing/pull/88).
   - Tests with existing files without the fields.
   - Update existing tests to validate the new fields.
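
   The behaviour described above can be modelled roughly as follows. This is a 
sketch only, not the actual Arrow classes: `EncodedStatsSketch`, 
`MakeWriterStats`, and `DescribeExactness` are hypothetical names, and 
representing the fields as `std::optional<bool>` is an assumption made here to 
capture files that do not carry them.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for encoded column statistics.
struct EncodedStatsSketch {
  std::string min;
  std::string max;
  bool has_min = false;
  bool has_max = false;
  // Left empty when the file does not carry the field (e.g. older files).
  std::optional<bool> is_min_value_exact;
  std::optional<bool> is_max_value_exact;
};

// Writer side: min/max are stored untruncated, so both flags are set to true
// whenever the statistics are generated.
EncodedStatsSketch MakeWriterStats(const std::string& min, const std::string& max) {
  EncodedStatsSketch stats;
  stats.min = min;
  stats.max = max;
  stats.has_min = true;
  stats.has_max = true;
  stats.is_min_value_exact = true;
  stats.is_max_value_exact = true;
  return stats;
}

// Reader side: distinguish "field absent" from an explicit true or false.
std::string DescribeExactness(const std::optional<bool>& flag) {
  if (!flag.has_value()) return "unknown";
  return *flag ? "exact" : "inexact";
}
```

   Keeping "absent" separate from `false` lets a reader treat statistics from 
older files conservatively instead of assuming they were truncated.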
   
   ### Are these changes tested?
   
   Yes, on CI.
   
   ### Are there any user-facing changes?
   
   Yes, the new fields are available to users through the API when reading 
Parquet files.
   


