Hi community,

Let me know whether a vote is needed, or whether we can simply review this in
https://github.com/apache/parquet-format/pull/509 (which removes the
"under development" lines).

Thanks,
Aihua

On Mon, Aug 18, 2025 at 10:53 AM Aihua Xu <aihu...@gmail.com> wrote:

> Hi Micah and community,
>
> We’ve generated the test files from Go (PR #94
> <https://github.com/apache/parquet-testing/pull/94>) and successfully
> validated them in Parquet-Java (PR #3258
> <https://github.com/apache/parquet-java/pull/3258>). During testing, we
> identified two minor issues in the Go generation:
>
>    1. The spec version should be *1* instead of *0*.
>    2. The Parquet TIME type should be TIME(isAdjustedToUTC=false, MICROS)
>       instead of TIME(isAdjustedToUTC=true, MICROS).
>
> These issues have already been addressed by Matt.
>
> Looking ahead, here’s what I propose for closing out the Variant release:
>
>    1. Start a vote to finalize the Variant spec (removing the two lines
>       under *active development*).
>    2. Start a vote for the Parquet-Java 1.16.0 release.
>
> Please share your thoughts on these next steps, or let me know if you see
> anything else we should address before proceeding.
>
> Thanks,
> Aihua
>
> On Sun, Aug 17, 2025 at 9:28 PM Micah Kornfield <emkornfi...@gmail.com>
> wrote:
>
>> >
>> > You want to see if the write path in Go is compatible? Let
>> > me check with Matt on this.
>>
>>
>> Yes, IIUC, there are now multiple OSS reader implementations that have
>> all been validated against files written by parquet-java. So I think it is
>> important that we validate that a second writer can produce files that can
>> be read by parquet-java.
>>
>> Thanks,
>> Micah
>>
>> On Mon, Aug 11, 2025 at 9:17 AM Aihua Xu <aihu...@gmail.com> wrote:
>>
>> > Hi Micah,
>> >
>> > What we have done is generate a large set of test cases from the
>> > Iceberg project and validate them in Java and Go. All of those
>> > implementations are independent. You want to see if the write path in Go
>> > is compatible? Let me check with Matt on this.
>> >
>> > Thanks,
>> > Aihua
>> >
>> > On Sun, Aug 10, 2025 at 9:24 PM Micah Kornfield <emkornfi...@gmail.com>
>> > wrote:
>> >
>> > > >
>> > > > We have completed cross-language validation for variant and the
>> > > > implementation compatibility appears solid
>> > >
>> > >
>> > > Great, apologies if I missed it, but did we verify that Java is able to
>> > > read Go's output?
>> > >
>> > > On Fri, Aug 8, 2025 at 9:38 PM Aihua Xu <aihu...@gmail.com> wrote:
>> > >
>> > > > We have completed cross-language validation for variant and the
>> > > > implementation compatibility appears solid. Matt has raised some
>> > > > comments regarding how to handle invalid cases. In fact, we had a long
>> > > > discussion during the spec development about whether to explicitly
>> > > > define the behavior for such cases. We should be able to clear that up
>> > > > soon.
>> > > >
>> > > >
>> > > > > On Aug 8, 2025, at 2:35 PM, Jia Yu <ji...@apache.org> wrote:
>> > > > >
>> > > > > Hi Gang,
>> > > > >
>> > > > > Thanks for letting me know.
>> > > > >
>> > > > > Would it make sense to create a new Parquet Java branch that includes
>> > > > > all other commits except the Variant type implementation? That way,
>> > > > > we could release a version without Variant entirely.
>> > > > >
>> > > > > We’re eager to get the Geo type released, but at the same time, we
>> > > > > don’t want to rush the Variant work or ship something that’s not
>> > > > > fully ready.
>> > > > >
>> > > > > Thanks,
>> > > > > Jia
>> > > > >
>> > > > >> On Fri, Aug 8, 2025 at 1:25 AM Gang Wu <ust...@gmail.com> wrote:
>> > > > >>
>> > > > >> parquet-cpp does not implement variant type yet, so it is safe to
>> > > > >> release the geo types. IIUC, there is no easy way to block users
>> > > > >> from producing files with variant types in parquet-java, so this is
>> > > > >> the main concern.
>> > > > >>
>> > > > >> Perhaps Aihua can provide an update on the progress?
>> > > > >>
>> > > > >> Best,
>> > > > >> Gang
>> > > > >>
>> > > > >>
>> > > > >>
>> > > > >>> On Fri, Aug 8, 2025 at 5:11 AM Jia Yu <ji...@apache.org> wrote:
>> > > > >>>
>> > > > >>> Hi all,
>> > > > >>>
>> > > > >>> Thank you for all your hard work on Parquet.
>> > > > >>>
>> > > > >>> Sorry for my ignorance, but I’d like to better understand why the
>> > > > >>> Parquet Java release for Geo types is currently tied to the Variant
>> > > > >>> type work. Arrow C++ (Parquet C++) has already been released with
>> > > > >>> Geo type support, and it doesn’t seem to have encountered similar
>> > > > >>> issues.
>> > > > >>>
>> > > > >>> The Geo type support in Iceberg has been stalled for several months
>> > > > >>> because the Iceberg PMC cannot review or merge the implementation
>> > > > >>> until there’s a corresponding Parquet Java release.
>> > > > >>>
>> > > > >>> Would it be possible to proceed with a new Parquet Java release for
>> > > > >>> Geo, and mark the Variant type as experimental or keep it behind a
>> > > > >>> feature flag?
>> > > > >>>
>> > > > >>> I’d really appreciate your thoughts on this and am looking forward
>> > > > >>> to your response.
>> > > > >>>
>> > > > >>> Thanks,
>> > > > >>> Jia
>> > > > >>>
>> > > > >>>
>> > > > >>>
>> > > > >>>> On Fri, Jul 18, 2025 at 10:33 AM Aihua Xu <aihu...@gmail.com> wrote:
>> > > > >>>
>> > > > >>>> Seems the concern from Gabor is that we should finalize the Variant
>> > > > >>>> spec
>> > > > >>>> (https://github.com/apache/parquet-format/blob/master/VariantEncoding.md
>> > > > >>>> and
>> > > > >>>> https://github.com/apache/parquet-format/blob/master/VariantShredding.md),
>> > > > >>>> have a parquet-format release, and then move forward with the
>> > > > >>>> parquet-java release. I totally agree.
>> > > > >>>>
>> > > > >>>> We should have met the requirement with two reference
>> > > > >>>> implementations for Variant in open source, and I will start a
>> > > > >>>> separate VOTE thread to close out the Variant spec if there are no
>> > > > >>>> objections.
>> > > > >>>>
>> > > > >>>> Thanks for the discussions.
>> > > > >>>> Aihua
>> > > > >>>>
>> > > > >>>>
>> > > > >>>> On Thu, Jul 17, 2025 at 3:41 AM Andrew Lamb <andrewlam...@gmail.com>
>> > > > >>>> wrote:
>> > > > >>>>
>> > > > >>>>>> At this point, I’d like to check if we have enough implementation
>> > > > >>>>>> coverage to move forward with finalizing the Variant spec. Would
>> > > > >>>>>> it make sense to start a vote thread at this stage?
>> > > > >>>>>
>> > > > >>>>> In my opinion we have sufficient open source implementations (the
>> > > > >>>>> Golang implementation on arrow-go) and a vote to finalize the spec
>> > > > >>>>> would be appropriate (and welcome).
>> > > > >>>>>
>> > > > >>>>> From my experience working on the Rust implementation so far, I
>> > > > >>>>> have found the spec clear and easy to understand, the design well
>> > > > >>>>> thought out, and have not encountered anything that would require
>> > > > >>>>> any changes.
>> > > > >>>>>
>> > > > >>>>> Kudos to the team who designed and wrote the spec for this feature,
>> > > > >>>>> Andrew
>> > > > >>>>>
>> > > > >>>>>
>> > > > >>>>>
>> > > > >>>>> On Thu, Jul 17, 2025 at 2:08 AM Jia Yu <ji...@apache.org> wrote:
>> > > > >>>>>
>> > > > >>>>>> Thanks Aihua!
>> > > > >>>>>>
>> > > > >>>>>> The geo type implementation in Iceberg is currently blocked by
>> > > > >>>>>> this release. Really looking forward to it.
>> > > > >>>>>>
>> > > > >>>>>> Jia
>> > > > >>>>>>
>> > > > >>>>>> On Wed, Jul 16, 2025 at 10:47 PM Gábor Szádovszky <ga...@apache.org>
>> > > > >>>>>> wrote:
>> > > > >>>>>>
>> > > > >>>>>>> My concern was related to the current stage of the Variant
>> > > > >>>>>>> specification and the fact that we started talking about releasing
>> > > > >>>>>>> parquet-java with Variant features.
>> > > > >>>>>>> If we formally release parquet-format with the finalized Variant
>> > > > >>>>>>> spec first, then I have no concerns about writing Variant values in
>> > > > >>>>>>> the upcoming parquet-java release. Otherwise, we need to block it
>> > > > >>>>>>> by default and mark it as an experimental feature.
>> > > > >>>>>>>
>> > > > >>>>>>> Cheers,
>> > > > >>>>>>> Gabor
>> > > > >>>>>>>
>> > > > >>>>>>> On Wed, Jul 16, 2025 at 7:37 PM Aihua Xu <aihu...@gmail.com> wrote:
>> > > > >>>>>>>
>> > > > >>>>>>>> Hi Gabor and all,
>> > > > >>>>>>>>
>> > > > >>>>>>>> Here’s my current understanding of the progress on the *Variant*
>> > > > >>>>>>>> support in Parquet:
>> > > > >>>>>>>>
>> > > > >>>>>>>>   - Per Parquet's requirements, we need at least two reference
>> > > > >>>>>>>>     implementations to finalize the Variant logical type
>> > > > >>>>>>>>     specification.
>> > > > >>>>>>>>   - The community is actively working on Java, Go, and Rust
>> > > > >>>>>>>>     implementations:
>> > > > >>>>>>>>      - Java already has the encoding and shredding implementations
>> > > > >>>>>>>>        in place:
>> > > > >>>>>>>>         - Variant Decoding
>> > > > >>>>>>>>           <https://github.com/apache/parquet-java/pull/3197>
>> > > > >>>>>>>>         - Variant Encoding
>> > > > >>>>>>>>           <https://github.com/apache/parquet-java/pull/3202>
>> > > > >>>>>>>>         - Variant Shredding Writer
>> > > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3223>
>> > > > >>>>>>>>         - Variant Shredding Reader
>> > > > >>>>>>>>           <https://github.com/apache/parquet-java/issues/3211>
>> > > > >>>>>>>>      - Go also includes encoding and shredding support:
>> > > > >>>>>>>>         - Variant Encoding/Decoding
>> > > > >>>>>>>>           <https://github.com/apache/arrow-go/pull/344>
>> > > > >>>>>>>>         - Variant Shredding
>> > > > >>>>>>>>           <https://github.com/apache/arrow-go/pull/434>
>> > > > >>>>>>>>      - Rust is currently working on the shredding implementation.
>> > > > >>>>>>>>
>> > > > >>>>>>>> In addition to these, we already have a full Variant implementation
>> > > > >>>>>>>> in Apache Iceberg, as well as in some closed-source engines.
>> > > > >>>>>>>>
>> > > > >>>>>>>> At this point, I’d like to check if we have enough implementation
>> > > > >>>>>>>> coverage to move forward with finalizing the Variant spec. Would it
>> > > > >>>>>>>> make sense to start a vote thread at this stage?
>> > > > >>>>>>>>
>> > > > >>>>>>>> Ultimately, our goal is to release a new version of parquet-format
>> > > > >>>>>>>> and parquet-java that includes the Variant logical type, so that
>> > > > >>>>>>>> Iceberg and other engines can officially depend on it and proceed
>> > > > >>>>>>>> with further implementation.
>> > > > >>>>>>>>
>> > > > >>>>>>>> Let me know your thoughts and how we should proceed.
>> > > > >>>>>>>>
>> > > > >>>>>>>> Thanks,
>> > > > >>>>>>>>
>> > > > >>>>>>>> Aihua
>> > > > >>>>>>>>
>> > > > >>>>>>>> On Sun, Jul 13, 2025 at 10:08 PM Gábor Szádovszky <ga...@apache.org>
>> > > > >>>>>>>> wrote:
>> > > > >>>>>>>>
>> > > > >>>>>>>>> Hi,
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> I was not able to open the recordings of the last meeting because
>> > > > >>>>>>>>> of permission issues. (Shouldn't these be accessible for anyone?)
>> > > > >>>>>>>>> So, I'm not sure if you have talked about this, but the Variant
>> > > > >>>>>>>>> spec is still not final. Since parquet-java already has Variant
>> > > > >>>>>>>>> support, how do we prevent writing potentially invalid Variant
>> > > > >>>>>>>>> data with the proper logical types we will use for the finalized
>> > > > >>>>>>>>> spec? Is it behind a feature flag?
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Cheers,
>> > > > >>>>>>>>> Gabor
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> On Fri, Jul 11, 2025 at 7:33 PM Aihua Xu <aihu...@gmail.com> wrote:
>> > > > >>>>>>>>>
>> > > > >>>>>>>>>> Hi community,
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> As discussed in the last community sync-up meeting, I'd like to
>> > > > >>>>>>>>>> proceed with releasing *Parquet-Java 1.16.0*, which will include
>> > > > >>>>>>>>>> support for *geo-type* and *variant*.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Please let me know if you have any objections or if you have any
>> > > > >>>>>>>>>> upcoming changes you'd like to include in this release.
>> > > > >>>>>>>>>> Thanks,
>> > > > >>>>>>>>>> Aihua
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>
>> > > > >>>>>>>>
>> > > > >>>>>>>
>> > > > >>>>>>
>> > > > >>>>>
>> > > > >>>>
>> > > > >>>
>> > > > >>
>> > > >
>> > >
>> >
>>
>
