emkornfield commented on code in PR #197:
URL: https://github.com/apache/parquet-format/pull/197#discussion_r1318928626


##########
src/main/thrift/parquet.thrift:
##########
@@ -529,7 +596,15 @@ struct DataPageHeader {
   /** Encoding used for repetition levels **/
   4: required Encoding repetition_level_encoding;
 
-  /** Optional statistics for the data in this page**/
+  /**
+   *  Optional statistics for the data in this page.
+   *
+   * For filter use-cases populating data in the page index is generally a superior
+   * solution because it allows readers to avoid IO, however not all readers make use
+   * of the page index.  For best compatibility both should be populated. If the writer

Review Comment:
   I think this has likely become tangential to the PR, so I'm going to revert it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
