Thanks for taking a look.

> I think this is a nice doc page, except that it's inventing an a
> posteriori meaning for "V1" and "V2".

I agree that earlier versions of the document did try to invent meaning,
which Ed and Jorris pointed out [1], and which I have tried to remove in
several updates such as [2] (and [3] this morning).

I would be happy to remove or reword any additional sections you think
imply such a meaning.

The current proposed wording is this:

> FileMetadata version field
>
> Each Parquet file has a version field in the thrift FileMetadata. This field has
> historically been used inconsistently: writers populate 1 or 2
> without a consistent relationship to the features actually used. See the
> note in parquet.thrift and this discussion for details.
>
> parquet-format release versions
>
> The Thrift definition is released independently of implementations such as
> parquet-java or arrow-rs, following the Apache release process. Note that release
> numbering DOES NOT FOLLOW semantic versioning:
>
> 1. The major version corresponds to the thrift FileMetadata version field.
>
> 2. Minor releases (e.g. 2.10.0 to 2.11.0) sometimes contain forward incompatible
> features. The minor version is not recorded in the file itself.

Are there other parts of the document you feel incorrectly imply a meaning
for V1 and V2?

Thanks,
Andrew

[1]: https://github.com/apache/parquet-site/pull/186#discussion_r3380588765
[2]: https://github.com/apache/parquet-site/pull/186/commits/89159332dc770c64d88f48fcdeb24be53fc82161
[3]: https://github.com/apache/parquet-site/pull/186/commits/0b3a17f0e8cddc39eeccdc3ca2fbd7e2def0b077

On Wed, Jun 10, 2026 at 5:01 AM Antoine Pitrou <[email protected]> wrote:

> hi Andrew,
>
> I think this is a nice doc page, except that it's inventing an a
> posteriori meaning for "V1" and "V2". Why is it useful? Why single out
> V2 aka. parquet-format 2.0.0?
>
> Regards
>
> Antoine.
>
>
> On 05/06/2026 at 16:18, Andrew Lamb wrote:
> > Dear Parquet Fans,
> >
> > I have become convinced over the last few discussions that it is more
> > important than ever to document clearly what V1 and V2 mean (including
> > the messy reality)
> >
> > Thus, I spent several hours documenting Parquet features and when each
> > was introduced [1]. I would love any feedback you may have.
> >
> > Thank you,
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/186
