I agree with Gábor. Yesterday, a PR was merged <https://github.com/apache/parquet-format/commits/dac5a35040ab57000b84246746c5c9cb25267261/src/main/thrift> that also touches the Thrift file. I think the release should be pretty straightforward, and I'm happy to help out with both releases.
Kind regards,
Fokko

On Mon, Aug 25, 2025 at 09:00, Gábor Szádovszky <ga...@apache.org> wrote:

> I think it would be cleaner to have a parquet-format release with the
> finalized spec first. Referencing it in the parquet-java release would
> state clearly that it is (supposed to) working according to the finalized
> specification.
>
> Gabor
>
> On Mon, Aug 25, 2025 at 4:48, Gang Wu <ust...@gmail.com> wrote:
>
>> The vote [1] for finalizing variant spec has passed so it's time to
>> revive this discussion.
>>
>> I just checked all the commits [2] to parquet-format since the last
>> release and found that there is no thrift definition change. All commits
>> are about clarification or fixing typos.
>> Should we skip the format release and directly jump to the parquet-java
>> release?
>>
>> [1] https://lists.apache.org/thread/mr2voh7twz2hql4y59x5c7o32kntmbvm
>> [2] https://github.com/apache/parquet-format/commits/master/?since=2025-03-24
>>
>> Best,
>> Gang
>>
>> On Wed, Aug 20, 2025 at 9:58 AM Gang Wu <ust...@gmail.com> wrote:
>>
>>> Thanks for the heads up!
>>>
>>> Yes, I think a formal vote is required before merging the PR.
>>>
>>> Best,
>>> Gang
>>>
>>> On Wed, Aug 20, 2025 at 12:36 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>
>>>> Hi community,
>>>>
>>>> Let me know if a vote process is needed or we can review in
>>>> https://github.com/apache/parquet-format/pull/509 (which is to remove
>>>> the under development lines).
>>>>
>>>> Thanks,
>>>> Aihua
>>>>
>>>> On Mon, Aug 18, 2025 at 10:53 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>>
>>>>> Hi Micah and community,
>>>>>
>>>>> We’ve generated the test files from Go (PR #94
>>>>> <https://github.com/apache/parquet-testing/pull/94>) and successfully
>>>>> validated them in Parquet-Java (PR #3258
>>>>> <https://github.com/apache/parquet-java/pull/3258>).
>>>>> During testing, we identified two minor issues in the Go generation:
>>>>>
>>>>> 1. The spec version should be *1* instead of *0*.
>>>>> 2. The Parquet TIME type should be TIME(isAdjustedToUTC=false, MICROS)
>>>>>    instead of TIME(isAdjustedToUTC=true, MICROS).
>>>>>
>>>>> These issues have already been addressed by Matt.
>>>>>
>>>>> Looking ahead, here’s what I propose for closing out the Variant
>>>>> release:
>>>>>
>>>>> 1. Start a vote to finalize the Variant spec (removing the two lines
>>>>>    under *active development*).
>>>>> 2. Start a vote for the Parquet-Java 1.16.0 release.
>>>>>
>>>>> Please share your thoughts on these next steps, or let me know if you
>>>>> see anything else we should address before proceeding.
>>>>>
>>>>> Thanks,
>>>>> Aihua
>>>>>
>>>>> On Sun, Aug 17, 2025 at 9:28 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>
>>>>>>> You want to see if the write path in GO is compatible? Let
>>>>>>> me check with Matt on this.
>>>>>>
>>>>>> Yes, IIUC, I think there are now multiple OSS reader implementations,
>>>>>> that have all been validated against parquet-java writing. So I think
>>>>>> it is important we validate a second writer can produce files that can
>>>>>> be read by parquet-java.
>>>>>>
>>>>>> Thanks,
>>>>>> Micah
>>>>>>
>>>>>> On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Micah,
>>>>>>>
>>>>>>> What we have done is to generate a large set of the test cases from
>>>>>>> the Iceberg project and validate in Java and GO. All of those
>>>>>>> implementations are independent. You want to see if the write path
>>>>>>> in GO is compatible?
>>>>>>> Let me check with Matt on this.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Aihua
>>>>>>>
>>>>>>> On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>>>
>>>>>>>>> We have completed cross-language validation for variant and the
>>>>>>>>> implementation compatibility appears solid
>>>>>>>>
>>>>>>>> Great, apologies if I missed it but did we verify Java being able
>>>>>>>> to read Go's output?
>>>>>>>>
>>>>>>>> On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> We have completed cross-language validation for variant and the
>>>>>>>>> implementation compatibility appears solid. Matt has raised some
>>>>>>>>> comments regarding how to handle invalid cases. In fact, we had a
>>>>>>>>> long discussion during the spec development about whether to
>>>>>>>>> explicitly define the behavior for such cases. We should be able
>>>>>>>>> to clear that out soon.
>>>>>>>>>
>>>>>>>>>> On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Gang,
>>>>>>>>>>
>>>>>>>>>> Thanks for letting me know.
>>>>>>>>>>
>>>>>>>>>> Would it make sense to create a new Parquet Java branch that
>>>>>>>>>> includes all other commits except the Variant type
>>>>>>>>>> implementation? That way, we could release a version without
>>>>>>>>>> Variant entirely.
>>>>>>>>>>
>>>>>>>>>> We’re eager to get the Geo type released, but at the same time,
>>>>>>>>>> we don’t want to rush the Variant work or ship something that’s
>>>>>>>>>> not fully ready.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Jia
>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> parquet-cpp does not implement variant type yet, so it is safe
>>>>>>>>>>> to release the geo types. IIUC, there is no easy way to block
>>>>>>>>>>> users from producing files with variant types in parquet-java,
>>>>>>>>>>> so this is the main concern.
>>>>>>>>>>>
>>>>>>>>>>> Perhaps Aihua can provide an update on the progress?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Gang
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for all your hard work on Parquet.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for my ignorance, but I’d like to better understand why
>>>>>>>>>>>> the Parquet Java release for Geo types is currently tied to
>>>>>>>>>>>> the Variant type work. Arrow C++ (Parquet C++) has already
>>>>>>>>>>>> been released with Geo type support, and it doesn’t seem to
>>>>>>>>>>>> have encountered similar issues.
>>>>>>>>>>>>
>>>>>>>>>>>> The Geo type support in Iceberg has been stalled for several
>>>>>>>>>>>> months because the Iceberg PMC cannot review or merge the
>>>>>>>>>>>> implementation until there’s a corresponding Parquet Java
>>>>>>>>>>>> release.
>>>>>>>>>>>>
>>>>>>>>>>>> Would it be possible to proceed with a new Parquet Java
>>>>>>>>>>>> release for Geo, and mark the Variant type as experimental or
>>>>>>>>>>>> keep it behind a feature flag?
>>>>>>>>>>>>
>>>>>>>>>>>> I’d really appreciate your thoughts on this and am looking
>>>>>>>>>>>> forward to your response.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Jia
>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Seems the concern from Gabor is that we should finalize the
>>>>>>>>>>>>> Variant spec (
>>>>>>>>>>>>> https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
>>>>>>>>>>>>> and
>>>>>>>>>>>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md
>>>>>>>>>>>>> ), have a parquet-format release, and then move forward with
>>>>>>>>>>>>> parquet-java release. I totally agree.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We should have met the requirement with two reference
>>>>>>>>>>>>> implementations for Variant in open source and I will start
>>>>>>>>>>>>> a VOTE thread separately to close out the Variant spec if no
>>>>>>>>>>>>> objections.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the discussions.
>>>>>>>>>>>>> Aihua
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> At this point, I’d like to check if we have enough
>>>>>>>>>>>>>>> implementation coverage to move forward with finalizing
>>>>>>>>>>>>>>> the Variant spec. Would it make sense to start a vote
>>>>>>>>>>>>>>> thread at this stage?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my opinion we have sufficient open source
>>>>>>>>>>>>>> implementations (the Golang implementation on arrow-go) and
>>>>>>>>>>>>>> a vote to finalize the spec would be appropriate (and
>>>>>>>>>>>>>> welcome)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> From my experience working on the Rust implementation so
>>>>>>>>>>>>>> far, I have found the spec clear and easy to understand,
>>>>>>>>>>>>>> the design well thought out, and have not encountered
>>>>>>>>>>>>>> anything that would require any changes.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kudos to the team who designed and wrote the spec for this
>>>>>>>>>>>>>> feature,
>>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks Aihua!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The geo type implementation in Iceberg is currently
>>>>>>>>>>>>>>> blocked by this release. Really looking forward to it.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jia
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My concern was related to the current stage of the
>>>>>>>>>>>>>>>> Variant specification and the fact that we started
>>>>>>>>>>>>>>>> talking about releasing parquet-java with Variant
>>>>>>>>>>>>>>>> features.
>>>>>>>>>>>>>>>> If we formally release parquet-format with the finalized
>>>>>>>>>>>>>>>> Variant spec first, then I have no concerns about writing
>>>>>>>>>>>>>>>> Variant values in the upcoming parquet-java release.
>>>>>>>>>>>>>>>> Otherwise, we need to block it by default and mark it as
>>>>>>>>>>>>>>>> an experimental feature.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Gabor and all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Here’s my current understanding of the progress on the
>>>>>>>>>>>>>>>>> *Variant* support in Parquet:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Per Parquet's requirements, we need at least two
>>>>>>>>>>>>>>>>>   reference implementations to finalize the Variant
>>>>>>>>>>>>>>>>>   logical type specification.
>>>>>>>>>>>>>>>>> - The community is actively working on Java, Go, and
>>>>>>>>>>>>>>>>>   Rust implementations:
>>>>>>>>>>>>>>>>>   - Java already has the encoding and shredding
>>>>>>>>>>>>>>>>>     implementations in place:
>>>>>>>>>>>>>>>>>     - Variant Decoding
>>>>>>>>>>>>>>>>>       <https://github.com/apache/parquet-java/pull/3197>
>>>>>>>>>>>>>>>>>     - Variant Encoding
>>>>>>>>>>>>>>>>>       <https://github.com/apache/parquet-java/pull/3202>
>>>>>>>>>>>>>>>>>     - Variant Shredding Writer
>>>>>>>>>>>>>>>>>       <https://github.com/apache/parquet-java/issues/3223>
>>>>>>>>>>>>>>>>>     - Variant Shredding Reader
>>>>>>>>>>>>>>>>>       <https://github.com/apache/parquet-java/issues/3211>
>>>>>>>>>>>>>>>>>   - Go also includes encoding and shredding support:
>>>>>>>>>>>>>>>>>     - Variant Encoding/Decoding
>>>>>>>>>>>>>>>>>       <https://github.com/apache/arrow-go/pull/344>
>>>>>>>>>>>>>>>>>     - Variant Shredding
>>>>>>>>>>>>>>>>>       <https://github.com/apache/arrow-go/pull/434>
>>>>>>>>>>>>>>>>>   - Rust is currently working on the shredding
>>>>>>>>>>>>>>>>>     implementation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition to these, we already have a full Variant
>>>>>>>>>>>>>>>>> implementation in Apache Iceberg, as well as in some
>>>>>>>>>>>>>>>>> closed-source engines.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> At this point, I’d like to check if we have enough
>>>>>>>>>>>>>>>>> implementation coverage to move forward with finalizing
>>>>>>>>>>>>>>>>> the Variant spec. Would it make sense to start a vote
>>>>>>>>>>>>>>>>> thread at this stage?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ultimately, our goal is to release a new version of
>>>>>>>>>>>>>>>>> parquet-format and parquet-java that includes the
>>>>>>>>>>>>>>>>> Variant logical type, so that Iceberg and other engines
>>>>>>>>>>>>>>>>> can officially depend on it and proceed with further
>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Let me know your thoughts and how we should proceed.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Aihua
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I was not able to open the recordings of the last
>>>>>>>>>>>>>>>>>> meeting because of permission issues. (Shouldn't these
>>>>>>>>>>>>>>>>>> be accessible for anyone?)
>>>>>>>>>>>>>>>>>> So, I'm not sure if you have talked about this, but the
>>>>>>>>>>>>>>>>>> Variant spec is still not final.
>>>>>>>>>>>>>>>>>> Since parquet-java already has Variant support, how do
>>>>>>>>>>>>>>>>>> we prevent writing potentially invalid Variant data
>>>>>>>>>>>>>>>>>> with the proper logical types we will use for the
>>>>>>>>>>>>>>>>>> finalized spec? Is it behind a feature flag?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> Gabor
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hi community,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> As discussed in the last community sync-up meeting,
>>>>>>>>>>>>>>>>>>> I'd like to proceed with releasing *Parquet-Java
>>>>>>>>>>>>>>>>>>> 1.16.0*, which will include support for *geo-type* and
>>>>>>>>>>>>>>>>>>> *variant*.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please let me know if you have any objections or if
>>>>>>>>>>>>>>>>>>> you have any upcoming changes you'd like to include in
>>>>>>>>>>>>>>>>>>> this release.
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Aihua