Hi Micah and community,

We’ve generated the test files from Go (PR #94 <https://github.com/apache/parquet-testing/pull/94>) and successfully validated them in Parquet-Java (PR #3258 <https://github.com/apache/parquet-java/pull/3258>). During testing, we identified two minor issues in the Go generation:

1. The spec version should be *1* instead of *0*.
2. The Parquet TIME type should be TIME(isAdjustedToUTC=false, MICROS) instead of TIME(isAdjustedToUTC=true, MICROS).

These issues have already been addressed by Matt.

Looking ahead, here’s what I propose for closing out the Variant release:

1. Start a vote to finalize the Variant spec (removing the two lines under *active development*).
2. Start a vote for the Parquet-Java 1.16.0 release.

Please share your thoughts on these next steps, or let me know if you see anything else we should address before proceeding.

Thanks,
Aihua

On Sun, Aug 17, 2025 at 9:28 PM Micah Kornfield <emkornfi...@gmail.com> wrote:

> > You want to see if the write path in GO is compatible? Let me check with Matt on this.
>
> Yes, IIUC, I think there are now multiple OSS reader implementations, that have all been validated against parquet-java writing. So I think it is important we validate a second writer can produce files that can be read by parquet-java.
>
> Thanks,
> Micah
>
> On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <aihu...@gmail.com> wrote:
>
> > Hi Micah,
> >
> > What we have done is to generate a large set of the test cases from the Iceberg project and validate in Java and GO. All of those implementations are independent. You want to see if the write path in GO is compatible? Let me check with Matt on this.
> >
> > Thanks,
> > Aihua
> >
> > On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > > > We have completed cross-language validation for variant and the implementation compatibility appears solid
> > >
> > > Great, apologies if I missed it but did we verify Java being able to read Go's output?
> > >
> > > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:
> > >
> > > > We have completed cross-language validation for variant and the implementation compatibility appears solid.
> > > > Matt has raised some comments regarding how to handle invalid cases. In fact, we had a long discussion during the spec development about whether to explicitly define the behavior for such cases. We should be able to clear that out soon.
> > > >
> > > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
> > > > >
> > > > > Hi Gang,
> > > > >
> > > > > Thanks for letting me know.
> > > > >
> > > > > Would it make sense to create a new Parquet Java branch that includes all other commits except the Variant type implementation? That way, we could release a version without Variant entirely.
> > > > >
> > > > > We’re eager to get the Geo type released, but at the same time, we don’t want to rush the Variant work or ship something that’s not fully ready.
> > > > >
> > > > > Thanks,
> > > > > Jia
> > > > >
> > > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
> > > > >>
> > > > >> parquet-cpp does not implement variant type yet, so it is safe to release the geo types. IIUC, there is no easy way to block users from producing files with variant types in parquet-java, so this is the main concern.
> > > > >>
> > > > >> Perhaps Aihua can provide an update on the progress?
> > > > >>
> > > > >> Best,
> > > > >> Gang
> > > > >>
> > > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
> > > > >>>
> > > > >>> Hi all,
> > > > >>>
> > > > >>> Thank you for all your hard work on Parquet.
> > > > >>>
> > > > >>> Sorry for my ignorance, but I’d like to better understand why the Parquet Java release for Geo types is currently tied to the Variant type work. Arrow C++ (Parquet C++) has already been released with Geo type support, and it doesn’t seem to have encountered similar issues.
> > > > >>>
> > > > >>> The Geo type support in Iceberg has been stalled for several months because the Iceberg PMC cannot review or merge the implementation until there’s a corresponding Parquet Java release.
> > > > >>>
> > > > >>> Would it be possible to proceed with a new Parquet Java release for Geo, and mark the Variant type as experimental or keep it behind a feature flag?
> > > > >>>
> > > > >>> I’d really appreciate your thoughts on this and am looking forward to your response.
> > > > >>>
> > > > >>> Thanks,
> > > > >>> Jia
> > > > >>>
> > > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
> > > > >>>>
> > > > >>>> Seems the concern from Gabor is that we should finalize the Variant spec (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md and https://github.com/apache/parquet-format/blob/master/VariantShredding.md), have a parquet-format release, and then move forward with parquet-java release. I totally agree.
> > > > >>>>
> > > > >>>> We should have met the requirement with two reference implementations for Variant in open source and I will start a VOTE thread separately to close out the Variant spec if no objections.
> > > > >>>>
> > > > >>>> Thanks for the discussions.
> > > > >>>> Aihua
> > > > >>>>
> > > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com> wrote:
> > > > >>>>
> > > > >>>>>> At this point, I’d like to check if we have enough implementation coverage to move forward with finalizing the Variant spec.
> > > > >>>>>> Would it make sense to start a vote thread at this stage?
> > > > >>>>>
> > > > >>>>> In my opinion we have sufficient open source implementations (the Golang implementation on arrow-go) and a vote to finalize the spec would be appropriate (and welcome)
> > > > >>>>>
> > > > >>>>> From my experience working on the Rust implementation so far, I have found the spec clear and easy to understand, the design well thought out, and have not encountered anything that would require any changes.
> > > > >>>>>
> > > > >>>>> Kudos to the team who designed and wrote the spec for this feature,
> > > > >>>>> Andrew
> > > > >>>>>
> > > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
> > > > >>>>>
> > > > >>>>>> Thanks Aihua!
> > > > >>>>>>
> > > > >>>>>> The geo type implementation in Iceberg is currently blocked by this release. Really looking forward to it.
> > > > >>>>>>
> > > > >>>>>> Jia
> > > > >>>>>>
> > > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org> wrote:
> > > > >>>>>>
> > > > >>>>>>> My concern was related to the current stage of the Variant specification and the fact that we started talking about releasing parquet-java with Variant features.
> > > > >>>>>>> If we formally release parquet-format with the finalized Variant spec first, then I have no concerns about writing Variant values in the upcoming parquet-java release. Otherwise, we need to block it by default and mark it as an experimental feature.
> > > > >>>>>>>
> > > > >>>>>>> Cheers,
> > > > >>>>>>> Gabor
> > > > >>>>>>>
> > > > >>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <aihu...@gmail.com> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Hi Gabor and all,
> > > > >>>>>>>>
> > > > >>>>>>>> Here’s my current understanding of the progress on the *Variant* support in Parquet:
> > > > >>>>>>>>
> > > > >>>>>>>> - Per Parquet's requirements, we need at least two reference implementations to finalize the Variant logical type specification.
> > > > >>>>>>>> - The community is actively working on Java, Go, and Rust implementations:
> > > > >>>>>>>>    - Java already has the encoding and shredding implementations in place:
> > > > >>>>>>>>       - Variant Decoding <https://github.com/apache/parquet-java/pull/3197>
> > > > >>>>>>>>       - Variant Encoding <https://github.com/apache/parquet-java/pull/3202>
> > > > >>>>>>>>       - Variant Shredding Writer <https://github.com/apache/parquet-java/issues/3223>
> > > > >>>>>>>>       - Variant Shredding Reader <https://github.com/apache/parquet-java/issues/3211>
> > > > >>>>>>>>    - Go also includes encoding and shredding support:
> > > > >>>>>>>>       - Variant Encoding/Decoding <https://github.com/apache/arrow-go/pull/344>
> > > > >>>>>>>>       - Variant Shredding <https://github.com/apache/arrow-go/pull/434>
> > > > >>>>>>>>    - Rust is currently working on the shredding implementation.
> > > > >>>>>>>>
> > > > >>>>>>>> In addition to these, we already have a full Variant implementation in Apache Iceberg, as well as in some closed-source engines.
> > > > >>>>>>>>
> > > > >>>>>>>> At this point, I’d like to check if we have enough implementation coverage to move forward with finalizing the Variant spec. Would it make sense to start a vote thread at this stage?
> > > > >>>>>>>>
> > > > >>>>>>>> Ultimately, our goal is to release a new version of parquet-format and parquet-java that includes the Variant logical type, so that Iceberg and other engines can officially depend on it and proceed with further implementation.
> > > > >>>>>>>>
> > > > >>>>>>>> Let me know your thoughts and how we should proceed.
> > > > >>>>>>>>
> > > > >>>>>>>> Thanks,
> > > > >>>>>>>>
> > > > >>>>>>>> Aihua
> > > > >>>>>>>>
> > > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org> wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi,
> > > > >>>>>>>>>
> > > > >>>>>>>>> I was not able to open the recordings of the last meeting because of permission issues. (Shouldn't these be accessible for anyone?)
> > > > >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is still not final. Since parquet-java already has Variant support, how do we prevent writing potentially invalid Variant data with the proper logical types we will use for the finalized spec?
> > > > >>>>>>>>> Is it behind a feature flag?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Cheers,
> > > > >>>>>>>>> Gabor
> > > > >>>>>>>>>
> > > > >>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <aihu...@gmail.com> wrote:
> > > > >>>>>>>>>
> > > > >>>>>>>>>> Hi community,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed with releasing *Parquet-Java 1.16.0*, which will include support for *geo-type* and *variant*.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Please let me know if you have any objections or if you have any upcoming changes you'd like to include in this release.
> > > > >>>>>>>>>> Thanks,
> > > > >>>>>>>>>> Aihua