This didn't go through earlier, so I'm resending.

> The Apache Parquet mailing list contributors, committers, and PMC does
not (and should not) have the luxury of mandating adoption trends (either
slower or faster) across the ecosystem.

I agree with this. I think the solution is not to limit how fast people can
adopt or use features. Write-side feature flags are useful for new features.

But we need a way to allow end users to reason about the features of
Parquet that they are using, not just leave everyone to figure out what set
of things each of their engines can support at a given time.

> I believe the best thing we can do as a community is foster clear
communication that helps implementers make the best decisions for their own
adoption.

+1 to this as well. I'm skeptical that a big list of features is the right
way to do this. It gets complicated very quickly as we add new
incompatibilities. It probably works for big features like ALP, FAST, and
PFOR, but when we start adding in a lot of minor breaking changes we are
going to lose people. The second issue with this is that we don't create an
incentive to implement all of the features that are possible, so we end up
with fragmentation.

I think the right path forward is a small set of write-side flags for new
features, and periodic points (versions?) where we break forward
compatibility to ship the rest. That keeps the list of things to worry
about (and research across engines) small and allows the project to move
forward. It also puts pressure on implementations to be compatible with
newer versions.

On Tue, Jun 2, 2026 at 5:55 AM Andrew Lamb <[email protected]> wrote:

> Thank you Dan, this is a very clear document
>
> I think this is the most important part and worth posting to the mailing
> list
>
> > The two-year norm also doesn't match what's actually happening in the
> ecosystem. Some features are entering mainstream usage well ahead of any
> such window — Variant and the Geo types are being adopted aggressively by
> writers and engines because the demand is real and immediate.
>
> In my opinion, the Apache Parquet mailing list contributors, committers,
> and PMC does not (and should not) have the luxury of mandating adoption
> trends (either slower or faster) across the ecosystem.
>
> As this point in your document makes clear, there are many Parquet
> stakeholders, each with different constraints and needs, that will adopt
> the features at their own rate. We shouldn't be trying to hold them back
> from using new features.
>
> I believe the best thing we can do as a community is foster clear
> communication that helps implementers make the best decisions for their own
> adoption. Specifically this is embodied in the "implementation status"
> page[1] which we can and should continue to evolve to let Parquet users
> choose the feature set that is right for them
>
> Andrew
>
> [1]: https://parquet.apache.org/docs/file-format/implementationstatus/
>
> On Mon, Jun 1, 2026 at 5:36 PM Daniel Weeks <[email protected]> wrote:
>
> > Hey Parquet Community,
> >
> > A few weeks back during one of the community syncs, the topic of
> > versioning came up (again) and I offered to pull together some thoughts
> > on how we might want to move forward.
> >
> > I've gathered some of the background and concerns about how we address
> > versioning across the ecosystem in order to have a discussion and gather
> > feedback.
> >
> > There are a lot of new features and major capabilities that community
> > members are eager to introduce, so it would be great to have a clear path
> > forward on how to coordinate changes.
> >
> > I've included the discussion in a doc
> > <https://docs.google.com/document/d/1zrbGT4kRCEdadBUludwfQR9b2CfLgH-RWn9zE84gYfg/edit?tab=t.0#heading=h.aozivdm2oj4d>
> > so that people can comment and respond either in-line or on this thread.
> >
> > Looking forward to discussion and feedback!
> > -Dan
> >
>
