pitrou commented on code in PR #46992:
URL: https://github.com/apache/arrow/pull/46992#discussion_r2298202182

##########
cpp/src/parquet/statistics.h:
##########
@@ -215,12 +220,15 @@ class PARQUET_EXPORT Statistics {
   /// \param[in] has_min_max whether the min/max statistics are set
   /// \param[in] has_null_count whether the null_count statistics are set
   /// \param[in] has_distinct_count whether the distinct_count statistics are set
+  /// \param[in] is_min_value_exact whether the min value is exact
+  /// \param[in] is_max_value_exact whether the max value is exact
   /// \param[in] pool a memory pool to use for any memory allocations, optional
   static std::shared_ptr<Statistics> Make(
       const ColumnDescriptor* descr, const std::string& encoded_min,
       const std::string& encoded_max, int64_t num_values, int64_t null_count,
       int64_t distinct_count, bool has_min_max, bool has_null_count,
-      bool has_distinct_count,
+      bool has_distinct_count, std::optional<bool> is_min_value_exact,
+      std::optional<bool> is_max_value_exact,

Review Comment:
   Well, I'm not sure allocating a buffer to hold a single integer or double value is a good use of a memory pool (`std::string` should be fine for such usage IMHO, and would actually be faster and smaller thanks to small string optimization).
   
   `Encode` could separately take a `MemoryPool` argument if we want to (though it might be similarly wasteful).
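
To make the small string optimization point concrete, here is a minimal sketch, not the Parquet C++ implementation. `EncodeFixedWidth`, `DecodeFixedWidth` and `StoredInline` are hypothetical helpers introduced only for illustration. The sketch assumes the encoded statistic is simply the raw bytes of a fixed-width value, and `StoredInline` relies on a layout heuristic that holds for the common standard library implementations but is not guaranteed by the C++ standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper (not a Parquet C++ API): copy the raw bytes of a
// fixed-width value into a std::string, similar in spirit to how an
// encoded min/max statistic is carried around as a string.
template <typename T>
std::string EncodeFixedWidth(T value) {
  static_assert(std::is_trivially_copyable<T>::value, "fixed-width types only");
  std::string out(sizeof(T), '\0');
  std::memcpy(&out[0], &value, sizeof(T));
  return out;
}

// Hypothetical inverse of EncodeFixedWidth.
template <typename T>
T DecodeFixedWidth(const std::string& encoded) {
  static_assert(std::is_trivially_copyable<T>::value, "fixed-width types only");
  assert(encoded.size() == sizeof(T));
  T value;
  std::memcpy(&value, encoded.data(), sizeof(T));
  return value;
}

// Heuristic check (assumption: implementation-specific layout): returns true
// if the character data lives inside the std::string object itself, which
// means no separate heap allocation was needed to hold it.
inline bool StoredInline(const std::string& s) {
  const auto begin = reinterpret_cast<std::uintptr_t>(&s);
  const auto data = reinterpret_cast<std::uintptr_t>(s.data());
  return data >= begin && data < begin + sizeof(std::string);
}
```

With a standard library that implements small string optimization, an 8-byte encoded `int64_t` or `double` would be expected to satisfy `StoredInline`, whereas a long string (say, 1024 bytes) would not. A buffer obtained from a `MemoryPool` always implies a separate allocation, which is the overhead the comment above is questioning.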