Thank you -- that is an excellent point. I have removed the V1 and V2 columns [1] and applied another suggestion by Ed [2] to avoid implying any meaning for V1 and V2.

Andrew

[1]: https://github.com/apache/parquet-site/pull/186/commits/1126b539491040f97627f720f9c593a9705a4377
[2]: https://github.com/apache/parquet-site/pull/186/commits/8cb094252a28f8e8b63bd454b4310f50a0e07a61

On Wed, Jun 10, 2026 at 8:51 AM Antoine Pitrou <[email protected]> wrote:

>
> Well, the features table still has columns "V1" and "V2". If we agree
> that 2.0.0 was not special compared to other parquet-format releases (in
> particular, it didn't break forwards compatibility), then why single it
> out?
>
> On 10/06/2026 at 13:03, Andrew Lamb wrote:
> > Thanks for taking a look.
> >
> >> I think this is a nice doc page, except that it's inventing an a
> >> posteriori meaning for "V1" and "V2".
> >
> > I agree that earlier versions of the document did try and invent meaning,
> > which Ed and Jorris pointed out[1], and I have tried to remove in several
> > updates like [2] (and [3] this morning) I would be happy to remove or
> > reword any additional sections you think are implying such a meaning
> >
> > The current proposed wording is this:
> >
> >> FileMetadata version field
> >>
> >> Each Parquet file has a version field in the thrift FileMetadata. This field has
> >> historically been used inconsistently: writers populate 1 or 2
> >> without a consistent relationship to the features actually used. See the
> >> note in parquet.thrift and this discussion for details.
> >>
> >> parquet-format release versions
> >>
> >> The Thrift definition is released independently of implementations such as
> >> parquet-java or arrow-rs, following the Apache release process. Note that release
> >> numbering DOES NOT FOLLOW semantic versioning:
> >>
> >> 1. The major version corresponds to the thrift FileMetadata version field.
> >>
> >> 2. Minor releases (e.g. 2.10.0 to 2.11.0) sometimes contain forward incompatible
> >> features. The minor version is not recorded in the file itself.
> >
> > Are there other parts of the document you feel incorrectly imply a meaning
> > for V1 and V2?
> >
> > Thanks,
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/186#discussion_r3380588765
> > [2]: https://github.com/apache/parquet-site/pull/186/commits/89159332dc770c64d88f48fcdeb24be53fc82161
> > [3]: https://github.com/apache/parquet-site/pull/186/commits/0b3a17f0e8cddc39eeccdc3ca2fbd7e2def0b077
> >
> > On Wed, Jun 10, 2026 at 5:01 AM Antoine Pitrou <[email protected]> wrote:
> >
> >> hi Andrew,
> >>
> >> I think this is a nice doc page, except that it's inventing an a
> >> posteriori meaning for "V1" and "V2". Why is it useful? Why single out
> >> V2 aka. parquet-format 2.0.0?
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >> On 05/06/2026 at 16:18, Andrew Lamb wrote:
> >>> Dear Parquet Fans,
> >>>
> >>> I have become convinced over the last few discussions that it is more
> >>> important than ever to document clearly what V1 and V2 mean (including the
> >>> messy reality)
> >>>
> >>> Thus, I spent several hours documenting Parquet features and when each was
> >>> introduced [1]. I would love any feedback you may have.
> >>>
> >>> Thank you,
> >>> Andrew
> >>>
> >>> [1]: https://github.com/apache/parquet-site/pull/186
