>
> We have completed cross-language validation for variant and the
> implementation compatibility appears solid


Great. Apologies if I missed it, but did we verify that Java can read
Go's output?

On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:

> We have completed cross-language validation for variant and the
> implementation compatibility appears solid. Matt has raised some comments
> regarding how to handle invalid cases. In fact, we had a long discussion
> during the spec development about whether to explicitly define the behavior
> for such cases. We should be able to clear that out soon.
>
>
> > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
> >
> > Hi Gang,
> >
> > Thanks for letting me know.
> >
> > Would it make sense to create a new Parquet Java branch that includes all
> > other commits except the Variant type implementation? That way, we could
> > release a version without Variant entirely.
> >
> > We’re eager to get the Geo type released, but at the same time, we don’t
> > want to rush the Variant work or ship something that’s not fully ready.
> >
> > Thanks,
> > Jia
> >
> >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
> >>
> >> parquet-cpp does not implement variant type yet, so it is safe to
> release
> >> the geo types. IIUC, there is no easy way to block users from producing
> >> files with variant types in parquet-java, so this is the main concern.
> >>
> >> Perhaps Aihua can provide an update on the progress?
> >>
> >> Best,
> >> Gang
> >>
> >>
> >>
> >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> Thank you for all your hard work on Parquet.
> >>>
> >>> Sorry for my ignorance, but I’d like to better understand why the
> Parquet
> >>> Java release for Geo types is currently tied to the Variant type work.
> >>> Arrow C++ (Parquet C++) has already been released with Geo type
> support,
> >>> and it doesn’t seem to have encountered similar issues.
> >>>
> >>> The Geo type support in Iceberg has been stalled for several months
> >> because
> >>> the Iceberg PMC cannot review or merge the implementation until
> there’s a
> >>> corresponding Parquet Java release.
> >>>
> >>> Would it be possible to proceed with a new Parquet Java release for
> Geo,
> >>> and mark the Variant type as experimental or keep it behind a feature
> >> flag?
> >>>
> >>> I’d really appreciate your thoughts on this and am looking forward to
> >> your
> >>> response.
> >>>
> >>> Thanks,
> >>> Jia
> >>>
> >>>
> >>>
> >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
> >>>
> >>>> Seems the concern from Gabor is that we should finalize the Variant spec
> >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> >>>> and
> >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
> >>>> have a parquet-format release, and then move forward with parquet-java
> >>>> release. I totally agree.
> >>>>
> >>>> We should have met the requirement with two reference implementations
> >> for
> >>>> Variant in open source and I will start a VOTE thread separately to
> >> close
> >>>> out the Variant spec if no objections.
> >>>>
> >>>> Thanks for the discussions.
> >>>> Aihua
> >>>>
> >>>>
> >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>>> At this point, I’d like to check if we have enough implementation
> >>>>> coverage
> >>>>>> to move forward with finalizing the Variant spec. Would it make
> >> sense
> >>>> to
> >>>>>> start a vote thread at this stage?
> >>>>>
> >>>>> In my opinion we have sufficient open source implementations (the
> >>> Golang
> >>>>> implementation on arrow-go) and a vote to finalize the spec would be
> >>>>> appropriate (and welcome)
> >>>>>
> >>>>> From my experience working on the Rust implementation so far, I have
> >>>> found
> >>>>> the spec clear and easy to understand, the design well thought out,
> >> and
> >>>>> have not encountered anything that would require any changes.
> >>>>>
> >>>>> Kudos to the team who designed and wrote the spec for this feature,
> >>>>> Andrew
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
> >>>>>
> >>>>>> Thanks Aihua!
> >>>>>>
> >>>>>> The geo type implementation in Iceberg is currently blocked by this
> >>>>>> release. Really looking forward to it.
> >>>>>>
> >>>>>> Jia
> >>>>>>
> >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <
> >> ga...@apache.org>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> My concern was related to the current stage of the Variant
> >>>>> specification
> >>>>>>> and the fact that we started talking about releasing parquet-java
> >>>> with
> >>>>>>> Variant features.
> >>>>>>> If we formally release parquet-format with the finalized Variant
> >>> spec
> >>>>>>> first, then I have no concerns about writing Variant values in
> >> the
> >>>>>> upcoming
> >>>>>>> parquet-java release. Otherwise, we need to block it by default
> >> and
> >>>>> mark
> >>>>>> it
> >>>>>>> as an experimental feature.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Gabor
> >>>>>>>
> >>>>>>> On Wed, Jul 16, 2025 at 7:37 PM Aihua Xu <aihu...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi Gabor and all,
> >>>>>>>>
> >>>>>>>> Here’s my current understanding of the progress on the
> >> *Variant*
> >>>>>> support
> >>>>>>> in
> >>>>>>>> Parquet:
> >>>>>>>>
> >>>>>>>>   - Per Parquet's requirements, we need at least two reference
> >>>>>>>>     implementations to finalize the Variant logical type specification.
> >>>>>>>>   - The community is actively working on Java, Go, and Rust implementations:
> >>>>>>>>      - Java already has the encoding and shredding implementations in place:
> >>>>>>>>         - Variant Decoding <https://github.com/apache/parquet-java/pull/3197>
> >>>>>>>>         - Variant Encoding <https://github.com/apache/parquet-java/pull/3202>
> >>>>>>>>         - Variant Shredding Writer <https://github.com/apache/parquet-java/issues/3223>
> >>>>>>>>         - Variant Shredding Reader <https://github.com/apache/parquet-java/issues/3211>
> >>>>>>>>      - Go also includes encoding and shredding support:
> >>>>>>>>         - Variant Encoding/Decoding <https://github.com/apache/arrow-go/pull/344>
> >>>>>>>>         - Variant Shredding <https://github.com/apache/arrow-go/pull/434>
> >>>>>>>>      - Rust is currently working on the shredding implementation.
> >>>>>>>>
> >>>>>>>> In addition to these, we already have a full Variant
> >>> implementation
> >>>>> in
> >>>>>>>> Apache Iceberg, as well as in some closed-source engines.
> >>>>>>>>
> >>>>>>>> At this point, I’d like to check if we have enough
> >> implementation
> >>>>>>> coverage
> >>>>>>>> to move forward with finalizing the Variant spec. Would it make
> >>>> sense
> >>>>>> to
> >>>>>>>> start a vote thread at this stage?
> >>>>>>>>
> >>>>>>>> Ultimately, our goal is to release a new version of
> >>> parquet-format
> >>>>> and
> >>>>>>>> parquet-java that includes the Variant logical type, so that
> >>>> Iceberg
> >>>>>> and
> >>>>>>>> other engines can officially depend on it and proceed with
> >>> further
> >>>>>>>> implementation.
> >>>>>>>>
> >>>>>>>> Let me know your thoughts and how we should proceed.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Aihua
> >>>>>>>>
> >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <
> >>>> ga...@apache.org>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I was not able to open the recordings of the last meeting
> >>> because
> >>>>> of
> >>>>>>>>> permission issues. (Shouldn't these be accessible for
> >> anyone?)
> >>>>>>>>> So, I'm not sure if you have talked about this, but the
> >> Variant
> >>>>> spec
> >>>>>> is
> >>>>>>>>> still not final. Since parquet-java already has Variant
> >>> support,
> >>>>> how
> >>>>>> do
> >>>>>>>> we
> >>>>>>>>> prevent writing potentially invalid Variant data with the
> >>> proper
> >>>>>>> logical
> >>>>>>>>> types we will use for the finalized spec? Is it behind a
> >>> feature
> >>>>>> flag?
> >>>>>>>>>
> >>>>>>>>> Cheers,
> >>>>>>>>> Gabor
> >>>>>>>>>
> >>>>>>>>> On Fri, Jul 11, 2025 at 7:33 PM Aihua Xu <aihu...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi community,
> >>>>>>>>>>
> >>>>>>>>>> As discussed in the last community sync-up meeting, I'd
> >> like
> >>> to
> >>>>>>> proceed
> >>>>>>>>>> with releasing *Parquet-Java 1.16.0*, which will include
> >>>> support
> >>>>>> for
> >>>>>>>>>> *geo-type* and *variant*.
> >>>>>>>>>>
> >>>>>>>>>> Please let me know if you have any objections or if you
> >> have
> >>>> any
> >>>>>>>> upcoming
> >>>>>>>>>> changes you'd like to include in this release.
> >>>>>>>>>> Thanks,
> >>>>>>>>>> Aihua
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
