Hi Dan,
Thanks for raising this.  The document asks four questions; my thoughts on
each are below:

> Do we agree the time-based, per-feature gate is challenging in the
> long-term as the primary coordination mechanism?


I don't agree with this at the moment.  I think we can revisit it when we
see concrete issues arising; until then, it seems like we are making more
work based on assumptions about where the ecosystem is headed.  I think one
still faces mostly the same decisions with a monolithic version scheme (you
still need to decide when the new version becomes the default).

I agree with Andrew that building out the compatibility matrix helps people
make informed decisions here.

> Is a deliberately curated capability level — "V3" — the right vehicle for
> setting shared reader/writer expectations, building on issue #384?


I think until we see otherwise the Parquet spec version or feature year is
sufficient here.  In general, what I've observed for monolithic version
numbering is that it either:

1.  Slows down adoption of things people really want to use and know
won't cause breakages, or
2.  Doesn't really help with adoption because implementers still pick
and choose what they see as valuable or have time to do.  We end up with a
lot of implementations saying "we support version 3, with the exception of
features X, Y, Z".

That being said, I think it would be useful for implementations to consider
how to let users choose on the spectrum from "conservative for
compatibility" to "bleeding edge" without having to toggle each feature
individually.  As mentioned above, I think using Parquet format version or
feature year to help toggle these is likely a good place to start.  I also
feel that this is something that implementations can choose on their own.
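
To make that concrete, here is a rough sketch of what such a knob could look
like inside a writer.  Everything in it is hypothetical: the feature names,
the years, and the function name are placeholders for illustration, not taken
from the spec or from any existing implementation.

```python
# Hypothetical table mapping each optional feature to the year it was added.
# The names and years are placeholders, not real Parquet data.
FEATURE_YEAR = {
    "feature_a": 2015,
    "feature_b": 2021,
    "feature_c": 2025,
}


def enabled_features(cutoff_year, overrides=None):
    """Return the features a writer would turn on for a given cutoff year.

    A low cutoff_year is "conservative for compatibility"; a high one is
    "bleeding edge".  The optional overrides dict (name -> bool) lets a user
    still toggle an individual feature when they need to.
    """
    enabled = {name for name, year in FEATURE_YEAR.items() if year <= cutoff_year}
    for name, on in (overrides or {}).items():
        if name not in FEATURE_YEAR:
            raise KeyError(f"unknown feature: {name}")
        if on:
            enabled.add(name)
        else:
            enabled.discard(name)
    return enabled


# One knob for most users, with a per-feature escape hatch.
conservative = enabled_features(2015)
with_opt_in = enabled_features(2021, overrides={"feature_c": True})
```

The point is only that a single cutoff covers the common case while
per-feature overrides remain available; each implementation could pick its
own default cutoff.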


> Which features belong in the first such bundle, given how far adoption of
> things like Variant and Geo has already run ahead?


Variant and Geo are probably a good discussion point.  What do you feel
would change with adoption of these types if Parquet were to start having
numbered versions again?  How has not having a V3 to include them in
hindered their adoption?  How would having a V3 increase their adoption?


> How far can and should we go in aligning the magic number, footer version,
> and release version?


In practice this feels like generally more churn than it is worth.  There
are many Parquet readers out there that don't align with parquet-java
versioning, so users are still going to be reading the release notes or
the compatibility matrix to figure out what they need/want.  It would be
good to make sure implementations actually follow the spec on "footer
version", so it can be used as a knob at some point.

Cheers,
Micah

On Tue, Jun 2, 2026 at 5:55 AM Andrew Lamb <[email protected]> wrote:

> Thank you Dan, this is a very clear document
>
> I think this is the most important part and worth posting to the mailing
> list
>
> > The two-year norm also doesn't match what's actually happening in the
> ecosystem. Some features are entering mainstream usage well ahead of any
> such window — Variant and the Geo types are being adopted aggressively by
> writers and engines because the demand is real and immediate.
>
> In my opinion, the Apache Parquet mailing list contributors, committers,
> and PMC does not (and should not) have the luxury of mandating adoption
> trends (either slower or faster) across the ecosystem.
>
> As this point in your document makes clear, there are many Parquet
> stakeholders, each with different constraints and needs, that will adopt
> the features at their own rate. We shouldn't be trying to hold them back
> from using new features.
>
> I believe the best thing we can do as a community is foster clear
> communication that helps implementers make the best decisions for their own
> adoption. Specifically this is embodied in the "implementation status"
> page[1] which we can and should continue to evolve to let Parquet users
> choose the feature set that is right for them
>
> Andrew
>
> [1]: https://parquet.apache.org/docs/file-format/implementationstatus/
>
> On Mon, Jun 1, 2026 at 5:36 PM Daniel Weeks <[email protected]> wrote:
>
> > Hey Parquet Community,
> >
> > A few weeks back during one of the community syncs, the topic of
> versioning
> > came up (again) and I offered to pull together some thoughts on how we
> > might want to move forward.
> >
> > I've gathered some of the background and concerns about how we address
> > versioning across the ecosystem in order to have a discussion and gather
> > feedback.
> >
> > There are a lot of new features and major capabilities that community
> > members are eager to introduce, so it would be great to have a clear path
> > forward on how to coordinate changes.
> >
> > I've included the discussion in a doc
> > <
> >
> https://docs.google.com/document/d/1zrbGT4kRCEdadBUludwfQR9b2CfLgH-RWn9zE84gYfg/edit?tab=t.0#heading=h.aozivdm2oj4d
> > >
> > so that people can comment and respond either in-line or on this thread.
> >
> > Looking forward to discussion and feedback!
> > -Dan
> >
>
