JFinis commented on code in PR #250:
URL: https://github.com/apache/parquet-format/pull/250#discussion_r1624058195
##########
src/main/thrift/parquet.thrift:
##########
@@ -883,13 +928,42 @@ struct ColumnChunk {
/** Encrypted column metadata for this chunk **/
9: optional binary encrypted_column_metadata
+ /**
+ * The column order for this chunk.
+ *
+ * If not set readers should check FileMetadata.column_orders
+ * instead.
+ *
+ * Populated in both PAR1 and PAR3
+ */
+ 10: optional ColumnOrder column_order
+ /** Set to true if all pages in the column chunk are dictionary
+ * encoded
+ */
+ 11: optional bool all_pages_dictionary_encoded
+ /**
+ * The index to the SchemaElement in FileMetadata for this
+ * column.
+ */
+ 12: optional i32 schema_index
Review Comment:
Also note that unless the column chunks are stored in the same order as the
schema, the whole "random access" idea of this PR goes out the window.
A MetadataPage with random access doesn't help me if I don't
know which column chunk I need to access in the first place.
Instead, what I want is this: to access the third column in the
schema, I access the third column chunk.
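
To make the difference concrete, here is a minimal Python sketch. It is not the PR's implementation: `ColumnChunk` is a hypothetical stand-in for the Thrift struct, `payload`, `chunk_by_position`, and `chunk_by_scan` are names invented for illustration, and `schema_index` is treated as a simple column ordinal (in a real file it points at a SchemaElement, which may not map one-to-one to leaf columns).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunk:
    # Hypothetical stand-in for the Thrift struct; only the fields needed here.
    schema_index: Optional[int] = None
    payload: str = ""


def chunk_by_position(chunks: List[ColumnChunk], column: int) -> ColumnChunk:
    """Chunk order matches the schema: the i-th column is the i-th chunk."""
    return chunks[column]


def chunk_by_scan(chunks: List[ColumnChunk], column: int) -> ColumnChunk:
    """Chunk order is arbitrary: inspect chunks until schema_index matches."""
    for chunk in chunks:
        if chunk.schema_index == column:
            return chunk
    raise KeyError(column)


# Chunks stored in schema order: a direct index is enough.
ordered = [ColumnChunk(0, "a"), ColumnChunk(1, "b"), ColumnChunk(2, "c")]
third = chunk_by_position(ordered, 2)

# Chunks stored in some other order: positional access returns the wrong
# chunk, so the reader has to decode and check each one.
shuffled = [ordered[2], ordered[0], ordered[1]]
third_again = chunk_by_scan(shuffled, 2)
```

With a guaranteed order, the lookup touches exactly one chunk. Without it, the reader may have to decode every chunk's metadata before finding the right one, which is the cost random access was supposed to avoid.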
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]