adamreeve commented on code in PR #242:
URL: https://github.com/apache/parquet-format/pull/242#discussion_r1629579263
##########
src/main/thrift/parquet.thrift:
##########
@@ -885,6 +971,44 @@ struct ColumnChunk {
9: optional binary encrypted_column_metadata
}
+struct ColumnChunkV3 {
+ /** File where column data is stored. **/
+ 1: optional string file_path
Review Comment:
@alkis, no, there is no `_metadata` column in the schema. There is a file
named `_metadata` that contains a copy of the metadata from all N Parquet
files. The only difference is that each metadata copy in this file has its
`file_path` field set to the path of the file containing the corresponding
data. The `_metadata` file contains no data pages itself, but it can be used
as an index to determine which file to read data from based on the metadata.
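
To make this concrete, here is a minimal sketch in Python. It is a toy model of the idea, not the Parquet API: `ColumnChunkMeta`, `build_metadata_index`, and `files_to_read` are hypothetical names, and using min/max values to pick files is just one assumed way a reader might use the metadata.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ColumnChunkMeta:
    """Toy stand-in for column chunk metadata (hypothetical, not the Thrift struct)."""
    column: str
    min_value: int  # assumed statistic, for illustration only
    max_value: int  # assumed statistic, for illustration only
    file_path: Optional[str] = None  # left unset in a data file's own footer


def build_metadata_index(
    footers: Dict[str, List[ColumnChunkMeta]]
) -> List[ColumnChunkMeta]:
    """Copy the metadata of every data file, setting file_path on each copy."""
    return [
        replace(chunk, file_path=path)
        for path, chunks in footers.items()
        for chunk in chunks
    ]


def files_to_read(
    index: List[ColumnChunkMeta], column: str, value: int
) -> List[str]:
    """Return the files whose [min, max] range for `column` may contain `value`."""
    return sorted({
        chunk.file_path
        for chunk in index
        if chunk.column == column
        and chunk.min_value <= value <= chunk.max_value
    })


# Footers of two data files; file_path is unset because the data is local.
footers = {
    "part-0.parquet": [ColumnChunkMeta("id", 0, 99)],
    "part-1.parquet": [ColumnChunkMeta("id", 100, 199)],
}

# The `_metadata` file holds only these copies, with no data pages.
index = build_metadata_index(footers)
```

With this model, `files_to_read(index, "id", 150)` selects only `part-1.parquet`, so a reader can skip opening the other file entirely.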
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]