> When things are changed such that existing readers are broken, I think you have failed. People would like to "improve" things, but if you're going to make a public specification with an unknown set of users, it's bad to break things.

I believe people using Parquet and creating files have a continuum of use cases, and there are different combinations of writer settings appropriate for those use cases:

1. Maximum compatibility (probably only use features in the original Parquet set, or ones that are entirely backwards compatible)
...: Some mix
2. Maximum performance / geospatial / semi-structured data / etc. (use newest features)

I am very skeptical that we will ever be at the point of "the ecosystem has adopted feature X enough that all new writers should use it", as this depends on what is currently deployed, pre-existing versions of software, and upgrade cycles we don't control.

On Wed, Jun 3, 2026 at 12:25 PM Andrew Bell <[email protected]> wrote:

> On Wed, Jun 3, 2026 at 11:35 AM Ryan Blue <[email protected]> wrote:
>
> > This didn't go through earlier, so I'm resending.
> >
> > > The Apache Parquet mailing list contributors, committers, and PMC does
> > > not (and should not) have the luxury of mandating adoption trends (either
> > > slower or faster) across the ecosystem.
> >
> > I agree with this. I think the solution is not to limit how fast people can
> > adopt or use features. Write-side feature flags are useful for new
> > features.
> >
>
> When things are changed such that existing readers are broken, I think you
> have failed. People would like to "improve" things, but if you're going to
> make a public specification with an unknown set of users, it's bad to break
> things.
>
> IMO you can change the file format all you want, but you need to do it in a
> way that doesn't break existing users. It's not sufficient to look at the
> readers you think exist and say "well, they've been upgraded/fixed". You
> don't know what's out there. If you want to make breaking changes, change
> the version (or call it Farquet or whatever) so that those who wrote code
> for the existing specification can continue to have things work with all
> the files that are "Parquet version X". Even the current proposal to make
> "path_in_schema" optional should require a version change.
>
> --
> Andrew Bell
> [email protected]
>
