tustvold commented on code in PR #5076:
URL: https://github.com/apache/arrow-rs/pull/5076#discussion_r1392801975
##########
parquet/src/file/statistics.rs:
##########
@@ -152,6 +158,12 @@ pub fn from_thrift(
stats.max_value
};
+ // Whether or not the min/max values are exact. Due to pre-existing truncation
+ // in other libraries such as parquet-mr, we can't assume that any given parquet file
Review Comment:
> No, parquet-mr only applies this to binary statistics
In which case perhaps we can have slightly less pessimistic defaulting behaviour here?
> by splitting the statistics_new_func macro into a statistics_new_func_always_exact and statistics_new_func_inexact
I think I would prefer to avoid this being a breaking change at all. This approach would be consistent with other structures in this codebase.
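To make the "less pessimistic defaulting" idea concrete, here is a minimal sketch, not the PR's actual code. It assumes a hypothetical `PhysicalKind` enum and a hypothetical `is_stat_exact` helper, and it assumes that "binary statistics" covers both variable-length and fixed-length byte arrays. When the file carries no explicit exactness flag, only binary columns are treated as possibly truncated:

```rust
/// Hypothetical stand-in for the Parquet physical types; not the crate's real enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PhysicalKind {
    Boolean,
    Int32,
    Int64,
    Float,
    Double,
    ByteArray,
    FixedLenByteArray,
}

/// Decide whether a min/max statistic should be treated as exact.
///
/// `explicit` models an optional exactness flag read from the file metadata
/// (an assumed name). If the writer recorded the flag, it wins. Otherwise fall
/// back to a per-type default: only binary types are assumed possibly truncated.
pub fn is_stat_exact(kind: PhysicalKind, explicit: Option<bool>) -> bool {
    explicit.unwrap_or(!matches!(
        kind,
        PhysicalKind::ByteArray | PhysicalKind::FixedLenByteArray
    ))
}
```

With this shape, `is_stat_exact(PhysicalKind::Int32, None)` is `true`, while `is_stat_exact(PhysicalKind::ByteArray, None)` is `false` unless the writer explicitly marked the value as exact.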