mapleFU commented on code in PR #216:
URL: https://github.com/apache/parquet-format/pull/216#discussion_r1343638311


##########
src/main/thrift/parquet.thrift:
##########
@@ -216,13 +216,22 @@ struct Statistics {
    /** count of distinct values occurring */
    4: optional i64 distinct_count;
    /**
-    * Min and max values for the column, determined by its ColumnOrder.
+    * lower and upper bound values for the column, determined by its ColumnOrder.
+    * These may be the actual minimum and maximum values found on a column chunk,
+    * but can also be (more compact) values that do not exist on a column chunk.
+    * For example, instead of storing "Blart Versenwald III", a writer may set
+    * min_value="B", max_value="C". Such more compact values must still be valid
+    * values within the column's logical type.
     *
     * Values are encoded using PLAIN encoding, except that variable-length byte
     * arrays do not include a length prefix.
     */
    5: optional binary max_value;
    6: optional binary min_value;
+   /** If true, max_value is the actual maximum value found on a column chunk **/
+   7: optional bool is_max_value_exact;
+   /** If true, min_value is the actual minimum value found on a column chunk **/
+   8: optional bool is_min_value_exact;

Review Comment:
   If this field is not set, does that mean we cannot tell whether the min/max is exact?
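
A minimal Python sketch of one conservative reading of this question, assuming an unset flag is treated as "unknown" rather than as true or false. The `Statistics` dataclass and the helper names below are hypothetical stand-ins for illustration, not generated Thrift code or any library's API. The byte comparison assumes an unsigned lexicographic ColumnOrder for byte arrays.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Statistics:
    """Hypothetical in-memory mirror of fields 5-8 of the Thrift struct."""
    max_value: Optional[bytes] = None
    min_value: Optional[bytes] = None
    is_max_value_exact: Optional[bool] = None  # None models "field not set"
    is_min_value_exact: Optional[bool] = None  # None models "field not set"


def max_is_known_exact(stats: Statistics) -> bool:
    # Assumption: only an explicit True lets a reader treat max_value as the
    # real maximum; an unset flag is handled the same way as False.
    return stats.max_value is not None and stats.is_max_value_exact is True


def min_is_known_exact(stats: Statistics) -> bool:
    # Same assumption as above, applied to min_value.
    return stats.min_value is not None and stats.is_min_value_exact is True


def may_contain(stats: Statistics, needle: bytes) -> bool:
    """Chunk pruning only needs valid bounds, not exact ones."""
    if stats.min_value is None or stats.max_value is None:
        return True  # no bounds available, so the chunk cannot be skipped
    # Python compares bytes lexicographically by unsigned byte value.
    return stats.min_value <= needle <= stats.max_value


# A file whose writer set only the bounds, without the new flags.
legacy = Statistics(min_value=b"B", max_value=b"C")
print(max_is_known_exact(legacy))                    # flag unset
print(may_contain(legacy, b"Blart Versenwald III"))  # inside the bounds
print(may_contain(legacy, b"Zed"))                   # outside the bounds
```

Under this reading, filtering on the bounds keeps working for files without the flags, while anything that needs the true minimum or maximum (for example answering `MAX(col)` from metadata alone) would have to require an explicit `true`.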



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@parquet.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
