> When things are changed such that existing readers are broken, I think you
> have failed. People would like to "improve" things, but if you're going to
> make a public specification with an unknown set of users, it's bad to break
> things.

I believe people using Parquet and creating files have a continuum of
use cases, and there are different combinations of writer settings
appropriate for those use cases:

1. Maximum compatibility (probably use only features in the original
Parquet set, or ones that are entirely backwards compatible)
...: Some mix
2. Maximum performance / geospatial / semi-structured data / etc. (use
newest features)

I am very skeptical that we will ever be at the point of "the ecosystem has
adopted feature X enough that all new writers should use it", as this
depends on what is currently deployed, pre-existing versions of software,
and upgrade cycles we don't control.



On Wed, Jun 3, 2026 at 12:25 PM Andrew Bell <[email protected]>
wrote:

> On Wed, Jun 3, 2026 at 11:35 AM Ryan Blue <[email protected]> wrote:
>
> > This didn't go through earlier, so I'm resending.
> >
> > > The Apache Parquet mailing list contributors, committers, and PMC does
> > not (and should not) have the luxury of mandating adoption trends (either
> > slower or faster) across the ecosystem.
> >
> > I agree with this. I think the solution is not to limit how fast people
> > can adopt or use features. Write-side feature flags are useful for new
> > features.
> >
>
> When things are changed such that existing readers are broken, I think you
> have failed. People would like to "improve" things, but if you're going to
> make a public specification with an unknown set of users, it's bad to break
> things.
>
> IMO you can change the file format all you want, but you need to do it in a
> way that doesn't break existing users. It's not sufficient to look at the
> readers you think exist and say "well, they've been upgraded/fixed". You
> don't know what's out there. If you want to make breaking changes, change
> the version (or call it Farquet or whatever) so that those who wrote code
> for the existing specification can continue to have things work with all
> the files that are "Parquet version X". Even the current proposal to make
> "path_in_schema" optional should require a version change.
>
> --
> Andrew Bell
> [email protected]
>
