pitrou commented on code in PR #46463:
URL: https://github.com/apache/arrow/pull/46463#discussion_r2095236912
##########
cpp/src/parquet/metadata.h:
##########
@@ -143,6 +143,7 @@ class PARQUET_EXPORT ColumnChunkMetaData {
bool is_stats_set() const;
bool is_geo_stats_set() const;
std::shared_ptr<Statistics> statistics() const;
+ std::shared_ptr<EncodedStatistics> encoded_statistics() const;
Review Comment:
Yes, I agree that it can be beneficial, for example if one is only
interested in the null count.
Note that `EncodedStatistics` is [already exposed on
`DataPage`](https://github.com/apache/arrow/blob/7f645d404a16e8c7c939dec70ad61f4ae4de7730/cpp/src/parquet/column_page.h#L68),
so it makes sense to expose it here as well.
A separate improvement would be to make the construction of `TypedStatistics`
faster. For example, creating a decoder instance just to PLAIN-decode one value
does not really make sense. But that's a bit orthogonal IMHO.
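
To illustrate both points, here is a rough, self-contained sketch. It does not use the actual Parquet C++ classes: `EncodedStatsSketch` is a hypothetical stand-in that mirrors only a few fields of `EncodedStatistics`, and `NullCount` and `PlainDecodeInt32` are made-up helper names. It assumes an INT32 column, whose PLAIN encoding is 4 little-endian bytes.

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>

// Hypothetical stand-in for a few fields of parquet::EncodedStatistics.
// min/max hold the PLAIN-encoded bytes exactly as stored in the metadata.
struct EncodedStatsSketch {
  std::string min;
  std::string max;
  int64_t null_count = 0;
  bool has_min = false;
  bool has_max = false;
  bool has_null_count = false;
};

// The null count is stored as a plain integer, so reading it requires
// no decoding of min/max at all.
inline std::optional<int64_t> NullCount(const EncodedStatsSketch& stats) {
  if (!stats.has_null_count) return std::nullopt;
  return stats.null_count;
}

// PLAIN-decode a single INT32 without instantiating any decoder object.
// Bytes are assembled explicitly so the result does not depend on host
// endianness.
inline std::optional<int32_t> PlainDecodeInt32(const std::string& bytes) {
  if (bytes.size() != sizeof(int32_t)) return std::nullopt;
  uint32_t raw = 0;
  for (int i = 0; i < 4; ++i) {
    raw |= static_cast<uint32_t>(static_cast<unsigned char>(bytes[i])) << (8 * i);
  }
  int32_t value;
  std::memcpy(&value, &raw, sizeof(value));
  return value;
}
```

A caller interested only in the null count would call `NullCount(stats)` and never touch `min` or `max`. The direct decode is only a sketch for one physical type; other types (BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY, and so on) would each need their own handling.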