Resending this because it didn't go through from my other account.

-1

I think there's a big difference between other features that we want to turn on using write-side flags and this one. This isn't introducing a beneficial feature like encryption or a new encoding. It is just omitting metadata as a minor cleanup. I don't think it makes sense to allow switching this on when we know that it breaks older readers.

This exposes the big gap in how we currently think about compatibility. While it makes sense to introduce a few beneficial features with write-side flags, introducing flags for lots of minor cases like this will cause fragmentation and needless incompatibilities through the Parquet ecosystem. I think we need to come up with a way to release these incompatible changes (and update defaults for the feature flags) as a group so that downstream users can reason about compatibility.

I'd change my vote to +1 if we had a plan for this. For instance, if we were to say that this is a part of the next group of forward-incompatible changes and had a way to block it being used until then, I would be +1.

On Tue, Jun 2, 2026 at 8:48 PM Micah Kornfield <[email protected]> wrote:

> +1
>
> > But practically we don't really intend to wait until 2028 and we're just
> > going to start doing this immediately.
>
> It looks like all the reference implementations are keeping this as default
> to true. I understand some people will use the knob the turn it off but I
> expect that people that do so will understand the blast radius of the
> change.
>
> -Micah
>
> On Tue, Jun 2, 2026 at 7:01 PM Gang Wu <[email protected]> wrote:
>
> > +1
> >
> > While I understand the concerns about playing loose with the spec and the
> > practical timeline for ecosystem adoption, I believe Parquet must be bold
> > in fixing its own issues, especially when facing internal challenges and
> > competition from other formats and scenarios.
> >
> > This specific change is benign and targets limited use cases. Furthermore,
> > since the PMC will rigorously review every spec change case by case, the
> > overall impact is controllable.
> >
> > Best,
> > Gang
> >
> > On Tue, Jun 2, 2026 at 8:49 PM Andrew Lamb <[email protected]> wrote:
> >
> > > +1
> > >
> > > While I share the concern that it is non trivial to understand the
> > > rollout status of various features across the Parquet Ecosystem, I
> > > don't think this particular change is any better/worse than existing
> > > features such as modular encryption, so gating it on getting such a
> > > consensus seems unfair to me
> > >
> > > I personally think the best thing we can do to help the ecosystem adopt
> > > features is continuing to refine our existing matrix of file format
> > > implementation status [1]
> > >
> > > Andrew
> > >
> > > [1]:
> > > https://parquet.apache.org/docs/file-format/implementationstatus/#read-support-by-year
> > >
> > > On Mon, Jun 1, 2026 at 6:03 PM Daniel Weeks <[email protected]> wrote:
> > >
> > > > -0 (though on the fence here).
> > > >
> > > > I'm a little concerned that we're starting to play very loose with
> > > > spec with a change like this. Some of the justification seems to be
> > > > that it's already supported in some projects and some readers don't
> > > > break by making this change.
> > > >
> > > > I see we've added the comment:
> > > >
> > > > > Writers are encouraged to make the writing of
> > > > > * this field optional, but for maximal compatibility should default to
> > > > > * writing the field until at least September 2028.
> > > >
> > > > But practically we don't really intend to wait until 2028 and we're
> > > > just going to start doing this immediately.
> > > >
> > > > -Dan
> > > >
> > > > On Mon, Jun 1, 2026 at 7:12 AM Ed Seidl <[email protected]> wrote:
> > > >
> > > > > Correction: the link to the PR is
> > > > > https://github.com/apache/parquet-format/pull/564
> > > > >
> > > > > Ed
> > > > >
> > > > > On 2026/06/01 14:07:18 Ed Seidl wrote:
> > > > > > I would like to propose a vote on adopting the format change
> > > > > > described in GH-563: Make ColumnMetaData.path_in_schema optional.
> > > > > >
> > > > > > This was proposed on the mailing list [1] and in GitHub issue
> > > > > > #563 [2].
> > > > > >
> > > > > > The proposed format specification changes are available in the PR:
> > > > > > https://github.com/apache/parquet-format/issues/563
> > > > > >
> > > > > > To verify this design's compatibility and correctness, three PoC
> > > > > > implementations were developed:
> > > > > > 1. Java: https://github.com/apache/parquet-java/pull/3470
> > > > > > 2. C++: https://github.com/apache/arrow/pull/49707
> > > > > > 3. Rust: https://github.com/apache/arrow-rs/pull/9678
> > > > > >
> > > > > > All three PoCs have been verified against a test file produced by
> > > > > > the Rust PoC:
> > > > > > https://github.com/apache/parquet-testing/pull/108
> > > > > >
> > > > > > The vote will be open for at least 72 hours.
> > > > > >
> > > > > > [ ] +1 Approve the proposed format change
> > > > > > [ ] +0 No opinion
> > > > > > [ ] -1 Do not approve (please provide specific reasons)
> > > > > >
> > > > > > Thanks,
> > > > > > Ed Seidl
> > > > > >
> > > > > > [1] https://lists.apache.org/thread/900503q07v95vyh6fk3qfn7ynb4w6yn2
> > > > > > [2] https://github.com/apache/parquet-format/issues/563
