emkornfield commented on code in PR #242:
URL: https://github.com/apache/parquet-format/pull/242#discussion_r1608953299


##########
README.md:
##########
@@ -107,12 +113,97 @@ start locations.  More details on what is contained in the metadata can be found
 in the Thrift definition.
 
 Metadata is written after the data to allow for single pass writing.
+This is especially useful when writing to backends such as S3.
 
 Readers are expected to first read the file metadata to find all the column
 chunks they are interested in.  The columns chunks should then be read sequentially.
 
  ![File Layout](https://raw.github.com/apache/parquet-format/master/doc/images/FileLayout.gif)
 
+### Parquet 3
+
+Parquet 3 files have the following overall structure:
+
+```
+4-byte magic number "PAR1"
+4-byte magic number "PAR3"
+
+<Column 1 Chunk 1 + Column Metadata>
+<Column 2 Chunk 1 + Column Metadata>
+...
+<Column N Chunk 1 + Column Metadata>
+<Column 1 Chunk 2 + Column Metadata>
+<Column 2 Chunk 2 + Column Metadata>
+...
+<Column N Chunk 2 + Column Metadata>
+...
+<Column 1 Chunk M + Column Metadata>
+<Column 2 Chunk M + Column Metadata>
+...
+<Column N Chunk M + Column Metadata>
+
+<File-level Column 1 Metadata v3>
+...
+<File-level Column N Metadata v3>
+
+File Metadata v3
+4-byte length in bytes of File Metadata v3 (little endian)
+4-byte magic number "PAR3"
+
+File Metadata
+4-byte length in bytes of File Metadata (little endian)
+4-byte magic number "PAR1"
+```
+
+Unlike the legacy File Metadata, the File Metadata v3 is designed to be light-weight
+to decode, regardless of the number of columns in the file. Individual column
+metadata can be opportunistically decoded depending on actual needs.
+
+This file structure is backwards-compatible. Parquet 1 readers will read and
+decode the legacy File Metadata in the file footer, while Parquet 3 readers
+will notice the "PAR3" magic number just before the File Metadata and will
+instead read and decode the File Metadata v3.

Review Comment:
   I think for 2, Thrift could avoid parsing it, assuming that we still follow 
the nested-footer pattern.
   
   E.g. `<field_marker 10003 and byte size><[serialized v3 metadata] + <v3 
trailing bits (length, digest, feature bitmask)>"PAR3">0x0000<footer size>PAR1`. 
As long as the byte size in the Thrift header accounts for everything through 
`PAR3` (as @alkis mentions below), it should work.
   
   So the encoding/serialization would be manual, but on decoding, old readers 
should automatically drop the unknown field, i.e. the field ID 10003 should 
never actually be modeled in the schema. (It is possible that some Thrift 
implementations retain unknown fields; I know proto does.)
   
   Note: "0x0000" is the stop field for structs, if I am reading the [Thrift 
spec](https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md#struct) 
correctly.
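
A minimal sketch of the nested-footer layout described above, to make the byte accounting concrete. Everything here is illustrative: the helper names (`build_footer`, `read_v3`, `skip_binary_field`) are hypothetical, the "v3 trailing bits" are reduced to a 4-byte little-endian length (digest and feature bitmask are omitted), and the stop field is written as the single `0x00` byte that the Thrift compact protocol uses. It assumes the unknown field is a compact-protocol binary field (type id 8) written with a long-form header.

```python
import struct
from typing import Optional

BINARY_TYPE = 0x08  # Thrift compact-protocol type id for binary/string
STOP = b"\x00"      # compact-protocol struct stop field (a single zero byte)


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as a ULEB128 varint."""
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)
        else:
            out.append(low)
            return bytes(out)


def read_uvarint(buf: bytes, pos: int):
    """Decode a ULEB128 varint at pos; return (value, next position)."""
    value, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        if not b & 0x80:
            return value, pos
        shift += 7


def zigzag16(n: int) -> int:
    """Zigzag-encode a signed 16-bit field id."""
    return ((n << 1) ^ (n >> 15)) & 0xFFFF


def build_footer(legacy_fields: bytes, v3_metadata: bytes, field_id: int = 10003) -> bytes:
    """Append the v3 metadata as one unknown binary field of the legacy struct."""
    # Field payload: v3 metadata, its length (simplified trailing bits), "PAR3".
    payload = v3_metadata + struct.pack("<I", len(v3_metadata)) + b"PAR3"
    # Long-form field header: type byte, zigzag varint field id, then the
    # varint byte size, which covers everything through "PAR3".
    header = bytes([BINARY_TYPE]) + uvarint(zigzag16(field_id)) + uvarint(len(payload))
    footer = legacy_fields + header + payload + STOP
    return footer + struct.pack("<I", len(footer)) + b"PAR1"


def read_v3(tail: bytes) -> Optional[bytes]:
    """v3 reader: look for "PAR3" just before the stop byte and footer size."""
    if tail[-4:] != b"PAR1":
        raise ValueError("not a Parquet footer")
    end = len(tail) - 8  # first byte after the legacy footer struct
    if tail[end - 1:end] != STOP or tail[end - 5:end - 1] != b"PAR3":
        return None  # no embedded v3 metadata
    (n,) = struct.unpack("<I", tail[end - 9:end - 5])
    return tail[end - 9 - n:end - 9]


def skip_binary_field(buf: bytes, pos: int) -> int:
    """Old reader: skip one long-form binary field it does not know about."""
    if buf[pos] != BINARY_TYPE:
        raise ValueError("expected a long-form binary field header")
    _field_id, pos = read_uvarint(buf, pos + 1)
    size, pos = read_uvarint(buf, pos)
    return pos + size
```

With a legacy struct holding a single `i32` field (`b"\x15\x02"`), `read_v3(build_footer(b"\x15\x02", b"v3-bytes"))` recovers the v3 bytes, while `skip_binary_field` started right after the legacy fields lands exactly on the stop byte, which is the behavior an old reader relies on.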
   
   So the trade-offs of this approach are:
   1. A bit of extra data to be copied for readers accessing the original 
version.
   2. A guaranteed lower bound on the number of IO operations for V3, since it 
is incorporated into v2.
   3. Potentially more memory utilization when accessing the original version, 
if unknown fields are retained by the Thrift implementation.
   
   Effectively, for the approach currently proposed in V3, the trade-offs are 
reversed.  I



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

