Thank you -- that is an excellent point.

I have removed the V1 and V2 columns [1] and applied another suggestion by
Ed [2] to avoid implying any meaning for V1 and V2.

Andrew

[1]:
https://github.com/apache/parquet-site/pull/186/commits/1126b539491040f97627f720f9c593a9705a4377
[2]:
https://github.com/apache/parquet-site/pull/186/commits/8cb094252a28f8e8b63bd454b4310f50a0e07a61

On Wed, Jun 10, 2026 at 8:51 AM Antoine Pitrou <[email protected]> wrote:

>
> Well, the features table still has columns "V1" and "V2". If we agree
> that 2.0.0 was not special compared to other parquet-format releases (in
> particular, it didn't break forwards compatibility), then why single it
> out?
>
>
>
> Le 10/06/2026 à 13:03, Andrew Lamb a écrit :
> > Thanks for taking a look.
> >
> >> I think this is a nice doc page, except that it's inventing an a
> >> posteriori meaning for "V1" and "V2".
> >
> > I agree that earlier versions of the document did try to invent meaning,
> > which Ed and Jorris pointed out [1], and which I have tried to remove in
> > several updates like [2] (and [3] this morning). I would be happy to remove
> > or reword any additional sections you think are implying such a meaning.
> >
> > The current proposed wording is this:
> >
> >> FileMetadata version field
> >>
> >> Each Parquet file has a version field in the thrift FileMetadata. This
> >> field has historically been used inconsistently: writers populate 1 or 2
> >> without a consistent relationship to the features actually used. See the
> >> note in parquet.thrift and this discussion for details.
> >>
> >> parquet-format release versions
> >>
> >> The Thrift definition is released independently of implementations such
> >> as parquet-java or arrow-rs, following the Apache release process. Note
> >> that release numbering DOES NOT FOLLOW semantic versioning:
> >>
> >> 1. The major version corresponds to the thrift FileMetadata version field.
> >>
> >> 2. Minor releases (e.g. 2.10.0 to 2.11.0) sometimes contain forward
> >> incompatible features. The minor version is not recorded in the file itself.
> >
> > Are there other parts of the document you feel incorrectly imply a
> > meaning for V1 and V2?
> >
> > Thanks,
> > Andrew
> >
> > [1]: https://github.com/apache/parquet-site/pull/186#discussion_r3380588765
> > [2]: https://github.com/apache/parquet-site/pull/186/commits/89159332dc770c64d88f48fcdeb24be53fc82161
> > [3]: https://github.com/apache/parquet-site/pull/186/commits/0b3a17f0e8cddc39eeccdc3ca2fbd7e2def0b077
> >
> >
> > On Wed, Jun 10, 2026 at 5:01 AM Antoine Pitrou <[email protected]> wrote:
> >
> >>
> >> hi Andrew,
> >>
> >> I think this is a nice doc page, except that it's inventing an a
> >> posteriori meaning for "V1" and "V2". Why is it useful? Why single out
> >> V2 aka. parquet-format 2.0.0?
> >>
> >> Regards
> >>
> >> Antoine.
> >>
> >>
> >> Le 05/06/2026 à 16:18, Andrew Lamb a écrit :
> >>> Dear Parquet Fans,
> >>>
> >>> I have become convinced over the last few discussions that it is more
> >>> important than ever to document clearly what V1 and V2 mean (including
> >>> the messy reality)
> >>>
> >>> Thus, I spent several hours documenting Parquet features and when each
> >>> was introduced [1]. I would love any feedback you may have.
> >>>
> >>> Thank you,
> >>> Andrew
> >>>
> >>> [1]: https://github.com/apache/parquet-site/pull/186
> >>>
> >>
> >>
> >>
> >
>
>
>
