I agree with Gábor. Yesterday, a PR was merged
<https://github.com/apache/parquet-format/commits/dac5a35040ab57000b84246746c5c9cb25267261/src/main/thrift>
that also touches the Thrift file. I think the release should be
pretty straightforward, and I'm happy to help out with both releases.

Kind regards,
Fokko

On Mon, Aug 25, 2025 at 09:00, Gábor Szádovszky <ga...@apache.org> wrote:

> I think it would be cleaner to have a parquet-format release with the
> finalized spec first. Referencing it in the parquet-java release would
> state clearly that it is (supposed to be) working according to the finalized
> specification.
>
> Gabor
>
> On Mon, Aug 25, 2025 at 4:48, Gang Wu <ust...@gmail.com> wrote:
>
> > The vote [1] for finalizing the Variant spec has passed, so it's time to
> > revive this discussion.
> >
> > I just checked all the commits [2] to parquet-format since the last
> > release and found that there is no Thrift definition change. All commits
> > are about clarification or fixing typos.
> > Should we skip the format release and jump directly to the parquet-java
> > release?
> >
> > [1] https://lists.apache.org/thread/mr2voh7twz2hql4y59x5c7o32kntmbvm
> > [2] https://github.com/apache/parquet-format/commits/master/?since=2025-03-24
> >
> > Best,
> > Gang
> >
> >
> > On Wed, Aug 20, 2025 at 9:58 AM Gang Wu <ust...@gmail.com> wrote:
> >
> > > Thanks for the heads up!
> > >
> > > Yes, I think a formal vote is required before merging the PR.
> > >
> > > Best,
> > > Gang
> > >
> > > On Wed, Aug 20, 2025 at 12:36 AM Aihua Xu <aihu...@gmail.com> wrote:
> > >
> > >> Hi community,
> > >>
> > >> Let me know if a vote process is needed or whether we can review in
> > >> https://github.com/apache/parquet-format/pull/509 (which removes the
> > >> "under development" lines).
> > >>
> > >> Thanks,
> > >> Aihua
> > >>
> > >> On Mon, Aug 18, 2025 at 10:53 AM Aihua Xu <aihu...@gmail.com> wrote:
> > >>
> > >> > Hi Micah and community,
> > >> >
> > >> > We’ve generated the test files from Go (PR #94
> > >> > <https://github.com/apache/parquet-testing/pull/94>) and successfully
> > >> > validated them in Parquet-Java (PR #3258
> > >> > <https://github.com/apache/parquet-java/pull/3258>). During testing, we
> > >> > identified two minor issues in the Go generation:
> > >> >
> > >> >    1. The spec version should be *1* instead of *0*.
> > >> >    2. The Parquet TIME type should be TIME(isAdjustedToUTC=false, MICROS)
> > >> >       instead of TIME(isAdjustedToUTC=true, MICROS).
> > >> >
> > >> > These issues have already been addressed by Matt.
> > >> >
> > >> > Looking ahead, here’s what I propose for closing out the Variant release:
> > >> >
> > >> >    1. Start a vote to finalize the Variant spec (removing the two lines
> > >> >       under *active development*).
> > >> >    2. Start a vote for the Parquet-Java 1.16.0 release.
> > >> >
> > >> > Please share your thoughts on these next steps, or let me know if you
> > >> > see anything else we should address before proceeding.
> > >> >
> > >> > Thanks,
> > >> > Aihua
> > >> >
> > >> > On Sun, Aug 17, 2025 at 9:28 PM Micah Kornfield <emkornfi...@gmail.com>
> > >> > wrote:
> > >> >
> > >> >> >
> > >> >> > You want to see if the write path in Go is compatible? Let
> > >> >> > me check with Matt on this.
> > >> >>
> > >> >>
> > >> >> Yes, IIUC, there are now multiple OSS reader implementations that
> > >> >> have all been validated against parquet-java writing. So I think it is
> > >> >> important that we validate that a second writer can produce files that
> > >> >> can be read by parquet-java.
> > >> >>
> > >> >> Thanks,
> > >> >> Micah
> > >> >>
> > >> >> On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <aihu...@gmail.com> wrote:
> > >> >>
> > >> >> > Hi Micah,
> > >> >> >
> > >> >> > What we have done is generate a large set of test cases from the
> > >> >> > Iceberg project and validate them in Java and Go. All of those
> > >> >> > implementations are independent. You want to see if the write path in
> > >> >> > Go is compatible? Let me check with Matt on this.
> > >> >> >
> > >> >> > Thanks,
> > >> >> > Aihua
> > >> >> >
> > >> >> > On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <emkornfi...@gmail.com>
> > >> >> > wrote:
> > >> >> >
> > >> >> > > >
> > >> >> > > > We have completed cross-language validation for variant and the
> > >> >> > > > implementation compatibility appears solid
> > >> >> > >
> > >> >> > >
> > >> >> > > Great. Apologies if I missed it, but did we verify that Java is able
> > >> >> > > to read Go's output?
> > >> >> > >
> > >> >> > > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:
> > >> >> > >
> > >> >> > > > We have completed cross-language validation for variant and the
> > >> >> > > > implementation compatibility appears solid. Matt has raised some
> > >> >> > > > comments regarding how to handle invalid cases. In fact, we had a
> > >> >> > > > long discussion during the spec development about whether to
> > >> >> > > > explicitly define the behavior for such cases. We should be able to
> > >> >> > > > clear that up soon.
> > >> >> > > >
> > >> >> > > >
> > >> >> > > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
> > >> >> > > > >
> > >> >> > > > > Hi Gang,
> > >> >> > > > >
> > >> >> > > > > Thanks for letting me know.
> > >> >> > > > >
> > >> >> > > > > Would it make sense to create a new Parquet Java branch that
> > >> >> > > > > includes all other commits except the Variant type implementation?
> > >> >> > > > > That way, we could release a version without Variant entirely.
> > >> >> > > > >
> > >> >> > > > > We’re eager to get the Geo type released, but at the same time, we
> > >> >> > > > > don’t want to rush the Variant work or ship something that’s not
> > >> >> > > > > fully ready.
> > >> >> > > > >
> > >> >> > > > > Thanks,
> > >> >> > > > > Jia
> > >> >> > > > >
> > >> >> > > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
> > >> >> > > > >>
> > >> >> > > > >> parquet-cpp does not implement the variant type yet, so it is safe
> > >> >> > > > >> to release the geo types. IIUC, there is no easy way to block users
> > >> >> > > > >> from producing files with variant types in parquet-java, so this is
> > >> >> > > > >> the main concern.
> > >> >> > > > >>
> > >> >> > > > >> Perhaps Aihua can provide an update on the progress?
> > >> >> > > > >>
> > >> >> > > > >> Best,
> > >> >> > > > >> Gang
> > >> >> > > > >>
> > >> >> > > > >>
> > >> >> > > > >>
> > >> >> > > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
> > >> >> > > > >>>
> > >> >> > > > >>> Hi all,
> > >> >> > > > >>>
> > >> >> > > > >>> Thank you for all your hard work on Parquet.
> > >> >> > > > >>>
> > >> >> > > > >>> Sorry for my ignorance, but I’d like to better understand why the
> > >> >> > > > >>> Parquet Java release for Geo types is currently tied to the Variant
> > >> >> > > > >>> type work. Arrow C++ (Parquet C++) has already been released with
> > >> >> > > > >>> Geo type support, and it doesn’t seem to have encountered similar
> > >> >> > > > >>> issues.
> > >> >> > > > >>>
> > >> >> > > > >>> The Geo type support in Iceberg has been stalled for several months
> > >> >> > > > >>> because the Iceberg PMC cannot review or merge the implementation
> > >> >> > > > >>> until there’s a corresponding Parquet Java release.
> > >> >> > > > >>>
> > >> >> > > > >>> Would it be possible to proceed with a new Parquet Java release for
> > >> >> > > > >>> Geo, and mark the Variant type as experimental or keep it behind a
> > >> >> > > > >>> feature flag?
> > >> >> > > > >>>
> > >> >> > > > >>> I’d really appreciate your thoughts on this and am looking forward
> > >> >> > > > >>> to your response.
> > >> >> > > > >>>
> > >> >> > > > >>> Thanks,
> > >> >> > > > >>> Jia
> > >> >> > > > >>>
> > >> >> > > > >>>
> > >> >> > > > >>>
> > >> >> > > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
> > >> >> > > > >>>
> > >> >> > > > >>>> It seems the concern from Gabor is that we should finalize the
> > >> >> > > > >>>> Variant spec
> > >> >> > > > >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
> > >> >> > > > >>>> and
> > >> >> > > > >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
> > >> >> > > > >>>> have a parquet-format release, and then move forward with the
> > >> >> > > > >>>> parquet-java release. I totally agree.
> > >> >> > > > >>>>
> > >> >> > > > >>>> We should have met the requirement of two reference implementations
> > >> >> > > > >>>> for Variant in open source, and I will start a separate VOTE thread
> > >> >> > > > >>>> to close out the Variant spec if there are no objections.
> > >> >> > > > >>>>
> > >> >> > > > >>>> Thanks for the discussions.
> > >> >> > > > >>>> Aihua
> > >> >> > > > >>>>
> > >> >> > > > >>>>
> > >> >> > > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com>
> > >> >> > > > >>>> wrote:
> > >> >> > > > >>>>
> > >> >> > > > >>>>>> At this point, I’d like to check if we have enough implementation
> > >> >> > > > >>>>>> coverage to move forward with finalizing the Variant spec. Would it
> > >> >> > > > >>>>>> make sense to start a vote thread at this stage?
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> In my opinion we have sufficient open source implementations (the
> > >> >> > > > >>>>> Golang implementation on arrow-go) and a vote to finalize the spec
> > >> >> > > > >>>>> would be appropriate (and welcome).
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> From my experience working on the Rust implementation so far, I
> > >> >> > > > >>>>> have found the spec clear and easy to understand and the design
> > >> >> > > > >>>>> well thought out, and I have not encountered anything that would
> > >> >> > > > >>>>> require any changes.
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> Kudos to the team who designed and wrote the spec for this feature,
> > >> >> > > > >>>>> Andrew
> > >> >> > > > >>>>>
> > >> >> > > > >>>>>
> > >> >> > > > >>>>>
> > >> >> > > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
> > >> >> > > > >>>>>
> > >> >> > > > >>>>>> Thanks Aihua!
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> The geo type implementation in Iceberg is currently blocked by this
> > >> >> > > > >>>>>> release. Really looking forward to it.
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> Jia
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org>
> > >> >> > > > >>>>>> wrote:
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>>> My concern was related to the current stage of the Variant
> > >> >> > > > >>>>>>> specification and the fact that we started talking about releasing
> > >> >> > > > >>>>>>> parquet-java with Variant features.
> > >> >> > > > >>>>>>> If we formally release parquet-format with the finalized Variant
> > >> >> > > > >>>>>>> spec first, then I have no concerns about writing Variant values in
> > >> >> > > > >>>>>>> the upcoming parquet-java release. Otherwise, we need to block it by
> > >> >> > > > >>>>>>> default and mark it as an experimental feature.
> > >> >> > > > >>>>>>>
> > >> >> > > > >>>>>>> Cheers,
> > >> >> > > > >>>>>>> Gabor
> > >> >> > > > >>>>>>>
> > >> >> > > > >>>>>>> On Wed, Jul 16, 2025 at 19:37, Aihua Xu <aihu...@gmail.com> wrote:
> > >> >> > > > >>>>>>>
> > >> >> > > > >>>>>>>> Hi Gabor and all,
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> Here’s my current understanding of the progress on the *Variant*
> > >> >> > > > >>>>>>>> support in Parquet:
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>>   - Per Parquet's requirements, we need at least two reference
> > >> >> > > > >>>>>>>>     implementations to finalize the Variant logical type
> > >> >> > > > >>>>>>>>     specification.
> > >> >> > > > >>>>>>>>   - The community is actively working on Java, Go, and Rust
> > >> >> > > > >>>>>>>>     implementations:
> > >> >> > > > >>>>>>>>      - Java already has the encoding and shredding implementations
> > >> >> > > > >>>>>>>>        in place:
> > >> >> > > > >>>>>>>>         - Variant Decoding
> > >> >> > > > >>>>>>>>           <https://github.com/apache/parquet-java/pull/3197>
> > >> >> > > > >>>>>>>>         - Variant Encoding
> > >> >> > > > >>>>>>>>           <https://github.com/apache/parquet-java/pull/3202>
> > >> >> > > > >>>>>>>>         - Variant Shredding Writer
> > >> >> > > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3223>
> > >> >> > > > >>>>>>>>         - Variant Shredding Reader
> > >> >> > > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3211>
> > >> >> > > > >>>>>>>>      - Go also includes encoding and shredding support:
> > >> >> > > > >>>>>>>>         - Variant Encoding/Decoding
> > >> >> > > > >>>>>>>>           <https://github.com/apache/arrow-go/pull/344>
> > >> >> > > > >>>>>>>>         - Variant Shredding
> > >> >> > > > >>>>>>>>           <https://github.com/apache/arrow-go/pull/434>
> > >> >> > > > >>>>>>>>      - Rust is currently working on the shredding implementation.
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> In addition to these, we already have a full Variant implementation
> > >> >> > > > >>>>>>>> in Apache Iceberg, as well as in some closed-source engines.
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> At this point, I’d like to check if we have enough implementation
> > >> >> > > > >>>>>>>> coverage to move forward with finalizing the Variant spec. Would it
> > >> >> > > > >>>>>>>> make sense to start a vote thread at this stage?
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> Ultimately, our goal is to release a new version of parquet-format
> > >> >> > > > >>>>>>>> and parquet-java that includes the Variant logical type, so that
> > >> >> > > > >>>>>>>> Iceberg and other engines can officially depend on it and proceed
> > >> >> > > > >>>>>>>> with further implementation.
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> Let me know your thoughts and how we should proceed.
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> Thanks,
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> Aihua
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org>
> > >> >> > > > >>>>>>>> wrote:
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>>> Hi,
> > >> >> > > > >>>>>>>>>
> > >> >> > > > >>>>>>>>> I was not able to open the recordings of the last meeting because
> > >> >> > > > >>>>>>>>> of permission issues. (Shouldn't these be accessible to anyone?)
> > >> >> > > > >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant
> > >> >> > > > >>>>>>>>> spec is still not final. Since parquet-java already has Variant
> > >> >> > > > >>>>>>>>> support, how do we prevent writing potentially invalid Variant data
> > >> >> > > > >>>>>>>>> with the proper logical types we will use for the finalized spec?
> > >> >> > > > >>>>>>>>> Is it behind a feature flag?
> > >> >> > > > >>>>>>>>>
> > >> >> > > > >>>>>>>>> Cheers,
> > >> >> > > > >>>>>>>>> Gabor
> > >> >> > > > >>>>>>>>>
> > >> >> > > > >>>>>>>>> On Fri, Jul 11, 2025 at 19:33, Aihua Xu <aihu...@gmail.com> wrote:
> > >> >> > > > >>>>>>>>>
> > >> >> > > > >>>>>>>>>> Hi community,
> > >> >> > > > >>>>>>>>>>
> > >> >> > > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to
> > >> >> > > > >>>>>>>>>> proceed with releasing *Parquet-Java 1.16.0*, which will include
> > >> >> > > > >>>>>>>>>> support for *geo-type* and *variant*.
> > >> >> > > > >>>>>>>>>>
> > >> >> > > > >>>>>>>>>> Please let me know if you have any objections or if you have any
> > >> >> > > > >>>>>>>>>> upcoming changes you'd like to include in this release.
> > >> >> > > > >>>>>>>>>> Thanks,
> > >> >> > > > >>>>>>>>>> Aihua
> > >> >> > > > >>>>>>>>>>
> > >> >> > > > >>>>>>>>>
> > >> >> > > > >>>>>>>>
> > >> >> > > > >>>>>>>
> > >> >> > > > >>>>>>
> > >> >> > > > >>>>>
> > >> >> > > > >>>>
> > >> >> > > > >>>
> > >> >> > > > >>
> > >> >> > > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >> >
> > >>
> > >
> >
>
