liukun4515 commented on PR #1947:
URL: https://github.com/apache/arrow-rs/pull/1947#issuecomment-1168179198

   > I can't help wondering if this was an oversight in the original parquet 
specification, not collocating column chunk metadata in the footer, that has 
since been papered over. All readers I can find simply read the 
ColumnChunkMetadata from the footer and ignore everything else.
   
   I share your confusion about the metadata.
   I went through parquet-mr (the Java implementation): it doesn't append 
this metadata at the end of each column, and it reads this metadata from the 
FileMetaData in the footer.
   
   But according to the format definition 
https://github.com/apache/parquet-format/blob/54e53e5d7794d383529dd30746378f19a12afd58/src/main/thrift/parquet.thrift#L790,
 `file_offset` is a required field, while `ColumnMetaData` 
https://github.com/apache/parquet-format/blob/54e53e5d7794d383529dd30746378f19a12afd58/src/main/thrift/parquet.thrift#L796
 is an optional field.
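
   To make the required/optional distinction concrete, here is a minimal Rust 
sketch. It is not the actual arrow-rs or parquet-mr code: the struct layout, 
the `MetadataSource` enum, and `metadata_source` are hypothetical 
simplifications. Only the required `file_offset` and the optional 
`ColumnMetaData` come from the Thrift definition linked above.

```rust
// Hypothetical, heavily trimmed stand-in for the Thrift `ColumnMetaData`.
#[derive(Debug, Clone, PartialEq)]
struct ColumnMetaData {
    num_values: i64,
}

// Hypothetical mirror of the Thrift `ColumnChunk` struct.
#[derive(Debug, Clone, PartialEq)]
struct ColumnChunk {
    // `required` in parquet.thrift, so it is always present.
    file_offset: i64,
    // `optional` in parquet.thrift, so a reader cannot assume it is set.
    meta_data: Option<ColumnMetaData>,
}

// Where a reader would have to look for the column metadata (assumed names).
#[derive(Debug, PartialEq)]
enum MetadataSource {
    // The metadata is embedded in the footer's ColumnChunk entry.
    Footer,
    // The metadata is missing from the footer; fall back to `file_offset`.
    FileOffset(i64),
}

fn metadata_source(chunk: &ColumnChunk) -> MetadataSource {
    match chunk.meta_data {
        Some(_) => MetadataSource::Footer,
        None => MetadataSource::FileOffset(chunk.file_offset),
    }
}
```

   In this sketch, a reader that only handles the `Footer` case matches the 
behavior described above; the `FileOffset` case is the path the format 
definition seems to leave open.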
   

