wgtmac commented on code in PR #250:
URL: https://github.com/apache/parquet-format/pull/250#discussion_r1621000028
##########
src/main/thrift/parquet.thrift:
##########
@@ -883,13 +928,42 @@ struct ColumnChunk {
/** Encrypted column metadata for this chunk **/
9: optional binary encrypted_column_metadata
+ /**
+ * The column order for this chunk.
+ *
+ * If not set readers should check FileMetadata.column_orders
+ * instead.
+ *
+ * Populated in both PAR1 and PAR3
+ */
+ 10: optional ColumnOrder column_order
Review Comment:
I'm afraid this might complicate the page index, because readers would have to
check that ColumnOrder is consistent across row groups. At the moment there is
only one ColumnOrder, so this is not a big issue, but it may become a problem if
we introduce more orders in the future. The order matters because it tells us
how to interpret the serialized min_value/max_value in the statistics. Perhaps
we can put this in the `SchemaElement` instead?
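
To make the concern concrete, here is a rough sketch of the check a reader would need before trusting min/max values across row groups. The Python classes and the `resolve_column_order` helper are hypothetical stand-ins for illustration only, not an existing Parquet API, and orders are modeled as plain strings:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunk:
    # Hypothetical stand-in for the Thrift struct; only the proposed
    # per-chunk field is modeled. None means "not set".
    column_order: Optional[str] = None


@dataclass
class RowGroup:
    columns: List[ColumnChunk]


def resolve_column_order(
    row_groups: List[RowGroup],
    file_column_orders: List[str],
    column_index: int,
) -> str:
    """Return the single order used by a column, or raise if row groups disagree."""
    seen = set()
    for row_group in row_groups:
        chunk_order = row_group.columns[column_index].column_order
        # Fall back to FileMetadata.column_orders when the chunk does not set it.
        if chunk_order is None:
            chunk_order = file_column_orders[column_index]
        seen.add(chunk_order)
    if len(seen) > 1:
        raise ValueError(
            f"column {column_index} has inconsistent orders: {sorted(seen)}"
        )
    # With no row groups at all, the file-level order is the only information.
    return seen.pop() if seen else file_column_orders[column_index]
```

With the order stored once per column (for example in `SchemaElement`), this loop and the failure case would not be needed.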
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]