Thanks for the heads up! Yes, I think a formal vote is required before merging the PR.

Best,
Gang

On Wed, Aug 20, 2025 at 12:36 AM Aihua Xu <aihu...@gmail.com> wrote:

> Hi community,
>
> Let me know if a vote process is needed or we can review in
> https://github.com/apache/parquet-format/pull/509 (which is to remove the
> under development lines).
>
> Thanks,
> Aihua
>
> On Mon, Aug 18, 2025 at 10:53 AM Aihua Xu <aihu...@gmail.com> wrote:
>
> > Hi Micah and community,
> >
> > We’ve generated the test files from Go (PR #94
> > <https://github.com/apache/parquet-testing/pull/94>) and successfully
> > validated them in Parquet-Java (PR #3258
> > <https://github.com/apache/parquet-java/pull/3258>). During testing, we
> > identified two minor issues in the Go generation:
> >
> > 1. The spec version should be *1* instead of *0*.
> > 2. The Parquet TIME type should be TIME(isAdjustedToUTC=false, MICROS)
> >    instead of TIME(isAdjustedToUTC=true, MICROS).
> >
> > These issues have already been addressed by Matt.
> >
> > Looking ahead, here’s what I propose for closing out the Variant release:
> >
> > 1. Start a vote to finalize the Variant spec (removing the two lines
> >    under *active development*).
> > 2. Start a vote for the Parquet-Java 1.16.0 release.
> >
> > Please share your thoughts on these next steps, or let me know if you see
> > anything else we should address before proceeding.
> >
> > Thanks,
> > Aihua
> >
> > On Sun, Aug 17, 2025 at 9:28 PM Micah Kornfield <emkornfi...@gmail.com>
> > wrote:
> >
> >> >
> >> > You want to see if the write path in GO is compatible? Let
> >> > me check with Matt on this.
> >>
> >>
> >> Yes, IIUC, I think there are now multiple OSS reader implementations, that
> >> have all been validated against parquet-java writing. So I think it is
> >> important we validate a second writer can produce files that can be read by
> >> parquet-java.
> >>
> >> Thanks,
> >> Micah
> >>
> >> On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <aihu...@gmail.com> wrote:
> >>
> >> > Hi Micah,
> >> >
> >> > What we have done is to generate a large set of the test cases from the
> >> > Iceberg project and validate in Java and GO. All of those implementations
> >> > are independent. You want to see if the write path in GO is compatible? Let
> >> > me check with Matt on this.
> >> >
> >> > Thanks,
> >> > Aihua
> >> >
> >> > On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <emkornfi...@gmail.com>
> >> > wrote:
> >> >
> >> > > >
> >> > > > We have completed cross-language validation for variant and the
> >> > > > implementation compatibility appears solid
> >> > >
> >> > >
> >> > > Great, apologies if I missed it but did we verify Java being able to read
> >> > > Go's output?
> >> > >
> >> > > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:
> >> > >
> >> > > > We have completed cross-language validation for variant and the
> >> > > > implementation compatibility appears solid. Matt has raised some comments
> >> > > > regarding how to handle invalid cases. In fact, we had a long discussion
> >> > > > during the spec development about whether to explicitly define the behavior
> >> > > > for such cases. We should be able to clear that out soon.
> >> > > >
> >> > > >
> >> > > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
> >> > > > >
> >> > > > > Hi Gang,
> >> > > > >
> >> > > > > Thanks for letting me know.
> >> > > > >
> >> > > > > Would it make sense to create a new Parquet Java branch that includes all
> >> > > > > other commits except the Variant type implementation? That way, we could
> >> > > > > release a version without Variant entirely.
> >> > > > >
> >> > > > > We’re eager to get the Geo type released, but at the same time, we don’t
> >> > > > > want to rush the Variant work or ship something that’s not fully ready.
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Jia
> >> > > > >
> >> > > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
> >> > > > >>
> >> > > > >> parquet-cpp does not implement variant type yet, so it is safe to release
> >> > > > >> the geo types. IIUC, there is no easy way to block users from producing
> >> > > > >> files with variant types in parquet-java, so this is the main concern.
> >> > > > >>
> >> > > > >> Perhaps Aihua can provide an update on the progress?
> >> > > > >>
> >> > > > >> Best,
> >> > > > >> Gang
> >> > > > >>
> >> > > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
> >> > > > >>>
> >> > > > >>> Hi all,
> >> > > > >>>
> >> > > > >>> Thank you for all your hard work on Parquet.
> >> > > > >>>
> >> > > > >>> Sorry for my ignorance, but I’d like to better understand why the Parquet
> >> > > > >>> Java release for Geo types is currently tied to the Variant type work.
> >> > > > >>> Arrow C++ (Parquet C++) has already been released with Geo type support,
> >> > > > >>> and it doesn’t seem to have encountered similar issues.
> >> > > > >>>
> >> > > > >>> The Geo type support in Iceberg has been stalled for several months because
> >> > > > >>> the Iceberg PMC cannot review or merge the implementation until there’s a
> >> > > > >>> corresponding Parquet Java release.
> >> > > > >>>
> >> > > > >>> Would it be possible to proceed with a new Parquet Java release for Geo,
> >> > > > >>> and mark the Variant type as experimental or keep it behind a feature flag?
> >> > > > >>>
> >> > > > >>> I’d really appreciate your thoughts on this and am looking forward to your
> >> > > > >>> response.
> >> > > > >>>
> >> > > > >>> Thanks,
> >> > > > >>> Jia
> >> > > > >>>
> >> > > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
> >> > > > >>>>
> >> > > > >>>> Seems the concern from Gabor is that we should finalize the Variant spec (
> >> > > > >>>> https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> >> > > > >>>> and
> >> > > > >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md
> >> > > > >>>> ), have a parquet-format release, and then move forward with parquet-java
> >> > > > >>>> release. I totally agree.
> >> > > > >>>>
> >> > > > >>>> We should have met the requirement with two reference implementations for
> >> > > > >>>> Variant in open source and I will start a VOTE thread separately to close
> >> > > > >>>> out the Variant spec if no objections.
> >> > > > >>>>
> >> > > > >>>> Thanks for the discussions.
> >> > > > >>>> Aihua
> >> > > > >>>>
> >> > > > >>>>
> >> > > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com>
> >> > > > >>>> wrote:
> >> > > > >>>>
> >> > > > >>>>>> At this point, I’d like to check if we have enough implementation coverage
> >> > > > >>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> >> > > > >>>>>> start a vote thread at this stage?
> >> > > > >>>>>
> >> > > > >>>>> In my opinion we have sufficient open source implementations (the Golang
> >> > > > >>>>> implementation on arrow-go) and a vote to finalize the spec would be
> >> > > > >>>>> appropriate (and welcome)
> >> > > > >>>>>
> >> > > > >>>>> From my experience working on the Rust implementation so far, I have found
> >> > > > >>>>> the spec clear and easy to understand, the design well thought out, and
> >> > > > >>>>> have not encountered anything that would require any changes.
> >> > > > >>>>>
> >> > > > >>>>> Kudos to the team who designed and wrote the spec for this feature,
> >> > > > >>>>> Andrew
> >> > > > >>>>>
> >> > > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
> >> > > > >>>>>
> >> > > > >>>>>> Thanks Aihua!
> >> > > > >>>>>>
> >> > > > >>>>>> The geo type implementation in Iceberg is currently blocked by this
> >> > > > >>>>>> release. Really looking forward to it.
> >> > > > >>>>>>
> >> > > > >>>>>> Jia
> >> > > > >>>>>>
> >> > > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org>
> >> > > > >>>>>> wrote:
> >> > > > >>>>>>
> >> > > > >>>>>>> My concern was related to the current stage of the Variant specification
> >> > > > >>>>>>> and the fact that we started talking about releasing parquet-java with
> >> > > > >>>>>>> Variant features.
> >> > > > >>>>>>> If we formally release parquet-format with the finalized Variant spec
> >> > > > >>>>>>> first, then I have no concerns about writing Variant values in the upcoming
> >> > > > >>>>>>> parquet-java release.
> >> > > > >>>>>>> Otherwise, we need to block it by default and mark it
> >> > > > >>>>>>> as an experimental feature.
> >> > > > >>>>>>>
> >> > > > >>>>>>> Cheers,
> >> > > > >>>>>>> Gabor
> >> > > > >>>>>>>
> >> > > > >>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <aihu...@gmail.com> wrote:
> >> > > > >>>>>>>
> >> > > > >>>>>>>> Hi Gabor and all,
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Here’s my current understanding of the progress on the *Variant* support in
> >> > > > >>>>>>>> Parquet:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> - Per Parquet's requirements, we need at least two reference
> >> > > > >>>>>>>>   implementations to finalize the Variant logical type specification.
> >> > > > >>>>>>>> - The community is actively working on Java, Go, and Rust implementations:
> >> > > > >>>>>>>>    - Java already has the encoding and shredding implementations in place:
> >> > > > >>>>>>>>       - Variant Decoding
> >> > > > >>>>>>>>         <https://github.com/apache/parquet-java/pull/3197>
> >> > > > >>>>>>>>       - Variant Encoding
> >> > > > >>>>>>>>         <https://github.com/apache/parquet-java/pull/3202>
> >> > > > >>>>>>>>       - Variant Shredding Writer
> >> > > > >>>>>>>>         <https://github.com/apache/parquet-java/issues/3223>
> >> > > > >>>>>>>>       - Variant Shredding Reader
> >> > > > >>>>>>>>         <https://github.com/apache/parquet-java/issues/3211>
> >> > > > >>>>>>>>    - Go also includes encoding and shredding support:
> >> > > > >>>>>>>>       - Variant Encoding/Decoding
> >> > > > >>>>>>>>         <https://github.com/apache/arrow-go/pull/344>
> >> > > > >>>>>>>>       - Variant Shredding
> >> > > > >>>>>>>>         <https://github.com/apache/arrow-go/pull/434>
> >> > > > >>>>>>>>    - Rust is currently working on the shredding implementation.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> In addition to these, we already have a full Variant implementation in
> >> > > > >>>>>>>> Apache Iceberg, as well as in some closed-source engines.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> At this point, I’d like to check if we have enough implementation coverage
> >> > > > >>>>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> >> > > > >>>>>>>> start a vote thread at this stage?
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Ultimately, our goal is to release a new version of parquet-format and
> >> > > > >>>>>>>> parquet-java that includes the Variant logical type, so that Iceberg and
> >> > > > >>>>>>>> other engines can officially depend on it and proceed with further
> >> > > > >>>>>>>> implementation.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Let me know your thoughts and how we should proceed.
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Thanks,
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> Aihua
> >> > > > >>>>>>>>
> >> > > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org>
> >> > > > >>>>>>>> wrote:
> >> > > > >>>>>>>>
> >> > > > >>>>>>>>> Hi,
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> I was not able to open the recordings of the last meeting because of
> >> > > > >>>>>>>>> permission issues.
> >> > > > >>>>>>>>> (Shouldn't these be accessible for anyone?)
> >> > > > >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is
> >> > > > >>>>>>>>> still not final. Since parquet-java already has Variant support, how do we
> >> > > > >>>>>>>>> prevent writing potentially invalid Variant data with the proper logical
> >> > > > >>>>>>>>> types we will use for the finalized spec? Is it behind a feature flag?
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> Cheers,
> >> > > > >>>>>>>>> Gabor
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <aihu...@gmail.com> wrote:
> >> > > > >>>>>>>>>
> >> > > > >>>>>>>>>> Hi community,
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed
> >> > > > >>>>>>>>>> with releasing *Parquet-Java 1.16.0*, which will include support for
> >> > > > >>>>>>>>>> *geo-type* and *variant*.
> >> > > > >>>>>>>>>>
> >> > > > >>>>>>>>>> Please let me know if you have any objections or if you have any upcoming
> >> > > > >>>>>>>>>> changes you'd like to include in this release.
> >> > > > >>>>>>>>>> Thanks,
> >> > > > >>>>>>>>>> Aihua