wgtmac commented on code in PR #250:
URL: https://github.com/apache/parquet-format/pull/250#discussion_r1620984136
##########
src/main/thrift/parquet.thrift:
##########
@@ -812,8 +837,20 @@ struct ColumnMetaData {
/** Set of all encodings used for pages in this column chunk.
* This information can be used to determine if all data pages are
- * dictionary encoded for example **/
+ * dictionary encoded for example
+ *
+ * PAR1: Optional. May be deprecated in a future release in favor
+ * serialized_encoding_stats.
+ * PAR3: Don't populate. Write serialized_page_encoding_stats.
+ **/
13: optional list<PageEncodingStats> encoding_stats;
+ /**
+ * Serialized page encoding stats.
+ *
+ * PAR1: Start populating after encoding_stats is deprecated.
+ * PAR3: Populate instead of encoding_stats.
+ */
+ 17: optional binary serialized_encoding_stats
Review Comment:
I think it makes sense to adopt @alkis's idea of using a boolean field here
to indicate whether dictionary encoding is applied to all data pages.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]