ggershinsky commented on code in PR #242:
URL: https://github.com/apache/parquet-format/pull/242#discussion_r1609368695


##########
src/main/thrift/parquet.thrift:
##########
@@ -1165,6 +1317,62 @@ struct FileMetaData {
   9: optional binary footer_signing_key_metadata
 }
 
+/** Metadata for a column in this file. */
+struct FileColumnMetadataV3 {
+  /** All column chunks in this file (one per row group) **/

Review Comment:
   add _in this column_ in the comment? also, rename the field to `chunks`?
   ```thrift
   /** Metadata for a column in this file. */
   struct FileColumnMetadataV3 {
     /** All column chunks in this column (in this file - one per row group) **/
     1: required list<ColumnChunkV3> chunks
   ```



##########
src/main/thrift/parquet.thrift:
##########
@@ -885,6 +973,44 @@ struct ColumnChunk {
   9: optional binary encrypted_column_metadata
 }
 
+struct ColumnChunkV3 {
+  /** File where column data is stored. **/
+  1: optional string file_path
+
+  /** Byte offset in file_path to the ColumnChunkMetaDataV3, optionally 
encrypted
+   ** CHANGED from v1: renamed to metadata_file_offset
+   **/
+  2: required i64 metadata_file_offset
+
+  /** NEW from v1: Byte length in file_path of ColumnChunkMetaDataV3, 
optionally encrypted
+   **/
+  3: required i32 metadata_file_length
+
+  /** REMOVED from v1: meta_data, encrypted_column_metadata.
+   ** Use encoded_metadata instead.
+   **/
+
+  /** NEW from v1: Column metadata for this chunk, duplicated here from 
file_path.
+   ** This is a Thrift-encoded ColumnChunkMetaDataV3, optionally encrypted.
+   **/
+  4: optional binary encoded_metadata
+
+  /** File offset of ColumnChunk's OffsetIndex **/
+  5: optional i64 offset_index_offset
+
+  /** Size of ColumnChunk's OffsetIndex, in bytes **/
+  6: optional i32 offset_index_length
+
+  /** File offset of ColumnChunk's ColumnIndex **/
+  7: optional i64 column_index_offset
+
+  /** Size of ColumnChunk's ColumnIndex, in bytes **/
+  8: optional i32 column_index_length
+
+  /** Crypto metadata of encrypted columns **/
+  9: optional ColumnCryptoMetaData crypto_metadata

Review Comment:
   not needed if we keep this in FileColumnMetadataV3



##########
src/main/thrift/parquet.thrift:
##########
@@ -1165,6 +1317,62 @@ struct FileMetaData {
   9: optional binary footer_signing_key_metadata
 }
 
+/** Metadata for a column in this file. */
+struct FileColumnMetadataV3 {
+  /** All column chunks in this file (one per row group) **/
+  1: required list<ColumnChunkV3> columns
+
+  /** Sort order used for the Statistics min_value and max_value fields
+   **/
+  2: optional ColumnOrder column_order;
+
+  /** NEW from v1: Optional key/value metadata for this column at the file 
level
+   **/
+  3: optional list<KeyValue> key_value_metadata
+}

Review Comment:
   can add the column encryption key_metadata here,
   ```thrift
   /** Crypto metadata of encrypted column **/
     4: optional ColumnCryptoMetaData crypto_metadata
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to