@Aihua Xu <aihu...@gmail.com> @RyanBlue it appears that the parquet files
in the indicated PR (https://github.com/apache/parquet-testing/pull/90/files)
do not seem to currently have the `Variant` logical type specified in their
schema. Is this intentional? The Go implementation won't (and shouldn't)
recognize the files as containing variant values without the logical type
being used, instead they are simply read as structs. Should this be updated?

Should I write something to have Go generate the files locally to test
against?

--Matt

On Sun, Jul 27, 2025 at 5:45 PM Aihua Xu <aihu...@gmail.com> wrote:

> Thanks Matt.
>
>
> > On Jul 27, 2025, at 2:31 PM, Matt Topol <zotthewiz...@gmail.com> wrote:
> >
> > I'll work this week on getting the Go implementation to use the same
> > testing files and ensure compatibility.
> >
> >> On Sun, Jul 27, 2025, 5:28 PM Aihua Xu <aihu...@gmail.com> wrote:
> >>
> >> Hi all,
> >>
> >> Following up on the test effort to validate the compatibility of the
> >> Variant implementation:
> >>
> >> Ryan has contributed test cases
> >> <https://github.com/apache/parquet-testing/pull/90/files> from Iceberg
> >> (see PR
> >> #13654 <https://github.com/apache/iceberg/pull/13654>), which I used to
> >> verify <https://github.com/apache/parquet-java/pull/3258/> the Variant
> >> implementation in Parquet-Java. The validation surfaced a few minor
> issues,
> >> but overall the results confirm compatibility between the two
> >> implementations.
> >>
> >> Let me know if you have any questions or additional follow-up requests.
> >>
> >> Thanks,
> >>
> >> Aihua
> >>
> >>
> >>
> >> On Wed, Jul 23, 2025 at 2:24 AM Andrew Lamb <andrewlam...@gmail.com>
> >> wrote:
> >>
> >>> I agree the parquet-testing repo should have example Parquet files
> >> storing
> >>> variants.
> >>>
> >>> It was brought to my attention recently that the duckdb folks made some
> >>> testing files[1] based on the Iceberg test suite.
> >>>
> >>> Perhaps we can add those files to parquet-testing as part of [2].
> >>>
> >>> I expect we'll get to testing the Rust shredding implementation in 2-3
> >>> weeks at which time I will likely help try and push this forward. It
> >> would
> >>> be great if someone else wanted to help do it beforehand.
> >>>
> >>> Andrew
> >>>
> >>> [1]: https://github.com/duckdb/duckdb/pull/18224
> >>> [2]: https://github.com/apache/parquet-testing/issues/75
> >>>
> >>>> On Wed, Jul 23, 2025 at 1:14 AM Gang Wu <ust...@gmail.com> wrote:
> >>>
> >>>> I was under the impression that parquet-testing does not yet have
> >> Parquet
> >>>> files with variant type annotations.
> >>>>
> >>>> Is this still the case? If not, should we add some (shredded and
> >>>> unshredded) files produced by Java and Go implementations?
> >>>>
> >>>> On Wed, Jul 23, 2025 at 3:18 AM Aihua Xu <aihu...@gmail.com> wrote:
> >>>>
> >>>>> Thanks Matt for the comment and working on the GO variant.
> >>>>>
> >>>>> Micah, that’s a good point. Let me check out the coverage
> >> completeness
> >>>> for
> >>>>> these two implementations.
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Jul 22, 2025, at 10:01 AM, Matt Topol <zotthewiz...@gmail.com>
> >>>> wrote:
> >>>>>>
> >>>>>> Assuming that the files with variants in
> >>>>>> https://github.com/apache/parquet-testing are generated by
> >>>> parquet-java,
> >>>>>> then we at least have confirmed that the Go implementation is able
> >> to
> >>>>> read
> >>>>>> variant files that are written by the Java implementation. So
> >> there's
> >>>> at
> >>>>>> least some testing of the two implementations against each other.
> >>>>>>
> >>>>>> --Matt
> >>>>>>
> >>>>>>> On Tue, Jul 22, 2025 at 12:29 AM Micah Kornfield <
> >>>> emkornfi...@gmail.com
> >>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>> Have we tested the two implementations against one another?
> >>>>>>>
> >>>>>>>> On Mon, Jul 21, 2025 at 9:14 PM Aihua Xu <aihu...@gmail.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>> Hi community,
> >>>>>>>>
> >>>>>>>> Per the Parquet specification requirements, two reference
> >>>>> implementations
> >>>>>>>> are needed to finalize the Variant logical type. Both Java and Go
> >>>>>>>> implementations now support variant encoding and shredding.
> >>>>>>>>
> >>>>>>>> Java already has the encoding and shredding implementations in
> >>> place:
> >>>>>>>> apache/parquet-java#3197 <
> >>>>>>> https://github.com/apache/parquet-java/pull/3197
> >>>>>>>>>
> >>>>>>>> apache/parquet-java#3202 <
> >>>>>>> https://github.com/apache/parquet-java/pull/3202
> >>>>>>>>>
> >>>>>>>> apache/parquet-java#3223
> >>>>>>>> <https://github.com/apache/parquet-java/issues/3223>
> >>>>>>>> apache/parquet-java#3211
> >>>>>>>> <https://github.com/apache/parquet-java/issues/3211>
> >>>>>>>>
> >>>>>>>> Go also includes encoding and shredding support:
> >>>>>>>> apache/arrow-go#344 <https://github.com/apache/arrow-go/pull/344
> >>>
> >>>>>>>> apache/arrow-go#434 <https://github.com/apache/arrow-go/pull/434
> >>>
> >>>>>>>>
> >>>>>>>> I propose that we remove the "under development" notes from the
> >>>>>>>> documentation and move forward with finalizing the specification
> >>> (PR
> >>>>> #509
> >>>>>>>> <https://github.com/apache/parquet-format/pull/509>).
> >>>>>>>> This vote will be open for at least 72 hours.
> >>>>>>>>
> >>>>>>>> [ ] +1 Finalize Varint and Shredding Spec
> >>>>>>>> [ ] +0
> >>>>>>>> [ ] -1 Do not release this because...
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> >>
>

Reply via email to