Hey Andrew (Bell),

I'm working on a cookbook and this would be a great addition. Let me know if you'd like to contribute a section on metadata descriptions. Will reach out once an initial draft is ready.

Thanks
Arnav

On Wed, Oct 15, 2025 at 2:55 AM Andrew Lamb <[email protected]> wrote:

> For additional spec reading pleasure, the format of the parquet.thrift file
> is Thrift Interface Definition Language[1].
>
> Parquet metadata is stored using the binary Thrift Compact Protocol[2].
>
> [1]: https://github.com/apache/thrift/blob/master/doc/specs/idl.md
> [2]: https://github.com/apache/thrift/blob/master/doc/specs/thrift-compact-protocol.md
>
> On Tue, Oct 14, 2025 at 5:04 PM Sylvain Lesage <[email protected]> wrote:
>
> > Maybe the comments in the specification (
> > https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift
> > ) are sufficiently clear?
> >
> > On Tuesday, 14 October 2025 at 10:50 PM, Andrew Bell <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > Is there a document that explains the metadata (
> > > https://parquet.apache.org/docs/file-format/metadata) in English? I can
> > > read code, but I'd rather not :) There seems to be some hand-wavy language
> > > that defines certain bits, but I haven't found anything that defines each
> > > field in the metadata or anything that really defines the format itself
> > > other than the metadata picture and this:
> > > https://parquet.apache.org/docs/file-format
> > >
> > > Thanks,
> > >
> > > --
> > > Andrew Bell
> > > [email protected]
