> We have completed cross-language validation for variant and the
> implementation compatibility appears solid
Great, apologies if I missed it but did we verify Java being able to read Go's output?

On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:

> We have completed cross-language validation for variant and the
> implementation compatibility appears solid. Matt has raised some comments
> regarding how to handle invalid cases. In fact, we had a long discussion
> during the spec development about whether to explicitly define the behavior
> for such cases. We should be able to clear that out soon.
>
> > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
> >
> > Hi Gang,
> >
> > Thanks for letting me know.
> >
> > Would it make sense to create a new Parquet Java branch that includes all
> > other commits except the Variant type implementation? That way, we could
> > release a version without Variant entirely.
> >
> > We’re eager to get the Geo type released, but at the same time, we don’t
> > want to rush the Variant work or ship something that’s not fully ready.
> >
> > Thanks,
> > Jia
> >
> >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
> >>
> >> parquet-cpp does not implement variant type yet, so it is safe to release
> >> the geo types. IIUC, there is no easy way to block users from producing
> >> files with variant types in parquet-java, so this is the main concern.
> >>
> >> Perhaps Aihua can provide an update on the progress?
> >>
> >> Best,
> >> Gang
> >>
> >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Thank you for all your hard work on Parquet.
> >>>
> >>> Sorry for my ignorance, but I’d like to better understand why the Parquet
> >>> Java release for Geo types is currently tied to the Variant type work.
> >>> Arrow C++ (Parquet C++) has already been released with Geo type support,
> >>> and it doesn’t seem to have encountered similar issues.
> >>>
> >>> The Geo type support in Iceberg has been stalled for several months because
> >>> the Iceberg PMC cannot review or merge the implementation until there’s a
> >>> corresponding Parquet Java release.
> >>>
> >>> Would it be possible to proceed with a new Parquet Java release for Geo,
> >>> and mark the Variant type as experimental or keep it behind a feature flag?
> >>>
> >>> I’d really appreciate your thoughts on this and am looking forward to your
> >>> response.
> >>>
> >>> Thanks,
> >>> Jia
> >>>
> >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
> >>>>
> >>>> Seems the concern from Gabor is that we should finalize the Variant spec
> >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> >>>> and
> >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
> >>>> have a parquet-format release, and then move forward with parquet-java
> >>>> release. I totally agree.
> >>>>
> >>>> We should have met the requirement with two reference implementations for
> >>>> Variant in open source and I will start a VOTE thread separately to close
> >>>> out the Variant spec if no objections.
> >>>>
> >>>> Thanks for the discussions.
> >>>> Aihua
> >>>>
> >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>>> At this point, I’d like to check if we have enough implementation coverage
> >>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> >>>>>> start a vote thread at this stage?
> >>>>>
> >>>>> In my opinion we have sufficient open source implementations (the Golang
> >>>>> implementation on arrow-go) and a vote to finalize the spec would be
> >>>>> appropriate (and welcome)
> >>>>>
> >>>>> From my experience working on the Rust implementation so far, I have found
> >>>>> the spec clear and easy to understand, the design well thought out, and
> >>>>> have not encountered anything that would require any changes.
> >>>>>
> >>>>> Kudos to the team who designed and wrote the spec for this feature,
> >>>>> Andrew
> >>>>>
> >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
> >>>>>
> >>>>>> Thanks Aihua!
> >>>>>>
> >>>>>> The geo type implementation in Iceberg is currently blocked by this
> >>>>>> release. Really looking forward to it.
> >>>>>>
> >>>>>> Jia
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> My concern was related to the current stage of the Variant specification
> >>>>>>> and the fact that we started talking about releasing parquet-java with
> >>>>>>> Variant features.
> >>>>>>> If we formally release parquet-format with the finalized Variant spec
> >>>>>>> first, then I have no concerns about writing Variant values in the upcoming
> >>>>>>> parquet-java release. Otherwise, we need to block it by default and mark it
> >>>>>>> as an experimental feature.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Gabor
> >>>>>>>
> >>>>>>> Aihua Xu <aihu...@gmail.com> wrote on Wed, Jul 16, 2025 at 19:37:
> >>>>>>>
> >>>>>>>> Hi Gabor and all,
> >>>>>>>>
> >>>>>>>> Here’s my current understanding of the progress on the *Variant* support in
> >>>>>>>> Parquet:
> >>>>>>>>
> >>>>>>>> - Per Parquet's requirements, we need at least two reference
> >>>>>>>>   implementations to finalize the Variant logical type specification.
> >>>>>>>> - The community is actively working on Java, Go, and Rust implementations:
> >>>>>>>>   - Java already has the encoding and shredding implementations in place:
> >>>>>>>>     - Variant Decoding <https://github.com/apache/parquet-java/pull/3197>
> >>>>>>>>     - Variant Encoding <https://github.com/apache/parquet-java/pull/3202>
> >>>>>>>>     - Variant Shredding Writer <https://github.com/apache/parquet-java/issues/3223>
> >>>>>>>>     - Variant Shredding Reader <https://github.com/apache/parquet-java/issues/3211>
> >>>>>>>>   - Go also includes encoding and shredding support:
> >>>>>>>>     - Variant Encoding/Decoding <https://github.com/apache/arrow-go/pull/344>
> >>>>>>>>     - Variant Shredding <https://github.com/apache/arrow-go/pull/434>
> >>>>>>>>   - Rust is currently working on the shredding implementation.
> >>>>>>>>
> >>>>>>>> In addition to these, we already have a full Variant implementation in
> >>>>>>>> Apache Iceberg, as well as in some closed-source engines.
> >>>>>>>>
> >>>>>>>> At this point, I’d like to check if we have enough implementation coverage
> >>>>>>>> to move forward with finalizing the Variant spec. Would it make sense to
> >>>>>>>> start a vote thread at this stage?
> >>>>>>>>
> >>>>>>>> Ultimately, our goal is to release a new version of parquet-format and
> >>>>>>>> parquet-java that includes the Variant logical type, so that Iceberg and
> >>>>>>>> other engines can officially depend on it and proceed with further
> >>>>>>>> implementation.
> >>>>>>>>
> >>>>>>>> Let me know your thoughts and how we should proceed.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Aihua
> >>>>>>>>
> >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I was not able to open the recordings of the last meeting because of
> >>>>>>>>> permission issues. (Shouldn't these be accessible for anyone?)
> >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant spec is
> >>>>>>>>> still not final. Since parquet-java already has Variant support, how do we
> >>>>>>>>> prevent writing potentially invalid Variant data with the proper logical
> >>>>>>>>> types we will use for the finalized spec? Is it behind a feature flag?
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Gabor
> >>>>>>>>>
> >>>>>>>>> Aihua Xu <aihu...@gmail.com> wrote on Fri, Jul 11, 2025 at 19:33:
> >>>>>>>>>
> >>>>>>>>>> Hi community,
> >>>>>>>>>>
> >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to proceed
> >>>>>>>>>> with releasing *Parquet-Java 1.16.0*, which will include support for
> >>>>>>>>>> *geo-type* and *variant*.
> >>>>>>>>>>
> >>>>>>>>>> Please let me know if you have any objections or if you have any upcoming
> >>>>>>>>>> changes you'd like to include in this release.
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Aihua