pitrou commented on code in PR #46463:
URL: https://github.com/apache/arrow/pull/46463#discussion_r2095236912


##########
cpp/src/parquet/metadata.h:
##########
@@ -143,6 +143,7 @@ class PARQUET_EXPORT ColumnChunkMetaData {
   bool is_stats_set() const;
   bool is_geo_stats_set() const;
   std::shared_ptr<Statistics> statistics() const;
+  std::shared_ptr<EncodedStatistics> encoded_statistics() const;

Review Comment:
   Yes, I agree that it can be beneficial, for example if one is only 
interested in the null count.
   
   Note that `EncodedStatistics` is [already exposed on 
`DataPage`](https://github.com/apache/arrow/blob/7f645d404a16e8c7c939dec70ad61f4ae4de7730/cpp/src/parquet/column_page.h#L68),
 so it makes sense to expose it here as well.
   
   A separate improvement would be to make the construction of `TypedStatistics` 
faster. For example, creating a decoder instance just to PLAIN-decode one value 
does not really make sense. But that's a bit orthogonal IMHO.
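   
   To illustrate the PLAIN-decoding point, here is a minimal sketch, not a 
proposal for the actual implementation. `DecodePlainValue` is a hypothetical 
helper that does not exist in Arrow. It assumes a fixed-width physical type 
(INT32, INT64, FLOAT, DOUBLE), which PLAIN stores as little-endian bytes, and a 
little-endian host:

```cpp
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <type_traits>

// Hypothetical helper (not part of Arrow): decode a single PLAIN-encoded
// fixed-width value directly from the raw bytes of an encoded min or max,
// without instantiating a decoder.
// Assumption: little-endian host, matching PLAIN's little-endian layout.
template <typename T>
std::optional<T> DecodePlainValue(const std::string& encoded) {
  static_assert(std::is_trivially_copyable_v<T>,
                "only fixed-width trivially copyable types are supported");
  if (encoded.size() != sizeof(T)) {
    return std::nullopt;  // byte length does not match the physical type
  }
  T value;
  std::memcpy(&value, encoded.data(), sizeof(T));  // avoids unaligned reads
  return value;
}
```

   Something like `DecodePlainValue<int32_t>(encoded_stats->min())` would then 
be enough for an INT32 column, assuming `min()` returns the raw encoded bytes 
as a `std::string`. Variable-width and big-endian cases are left out of this 
sketch.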
   



