I'll work this week on getting the Go implementation to use the same
testing files and ensure compatibility.

On Sun, Jul 27, 2025, 5:28 PM Aihua Xu <aihu...@gmail.com> wrote:

> Hi all,
>
> Following up on the test effort to validate the compatibility of the
> Variant implementation:
>
> Ryan has contributed test cases
> <https://github.com/apache/parquet-testing/pull/90/files> from Iceberg
> (see PR
> #13654 <https://github.com/apache/iceberg/pull/13654>), which I used to
> verify <https://github.com/apache/parquet-java/pull/3258/> the Variant
> implementation in Parquet-Java. The validation surfaced a few minor issues,
> but overall the results confirm compatibility between the two
> implementations.
>
> Let me know if you have any questions or additional follow-up requests.
>
> Thanks,
>
> Aihua
>
>
>
> On Wed, Jul 23, 2025 at 2:24 AM Andrew Lamb <andrewlam...@gmail.com>
> wrote:
>
> > I agree the parquet-testing repo should have example Parquet files
> > storing variants.
> >
> > It was brought to my attention recently that the duckdb folks made some
> > testing files[1] based on the Iceberg test suite.
> >
> > Perhaps we can add those files to parquet-testing as part of [2].
> >
> > I expect we'll get to testing the Rust shredding implementation in 2-3
> > weeks, at which time I will likely help push this forward. It would be
> > great if someone else wanted to help do it beforehand.
> >
> > Andrew
> >
> > [1]: https://github.com/duckdb/duckdb/pull/18224
> > [2]: https://github.com/apache/parquet-testing/issues/75
> >
> > On Wed, Jul 23, 2025 at 1:14 AM Gang Wu <ust...@gmail.com> wrote:
> >
> > > I was under the impression that parquet-testing does not yet have
> > > Parquet files with variant type annotations.
> > >
> > > Is this still the case? If so, should we add some (shredded and
> > > unshredded) files produced by the Java and Go implementations?
> > >
> > > On Wed, Jul 23, 2025 at 3:18 AM Aihua Xu <aihu...@gmail.com> wrote:
> > >
> > > > > Thanks, Matt, for the comment and for working on the Go variant.
> > > > >
> > > > > Micah, that’s a good point. Let me check how complete the coverage is
> > > > > for these two implementations.
> > > >
> > > >
> > > >
> > > > > > On Jul 22, 2025, at 10:01 AM, Matt Topol <zotthewiz...@gmail.com> wrote:
> > > > >
> > > > > > Assuming that the files with variants in
> > > > > > https://github.com/apache/parquet-testing are generated by
> > > > > > parquet-java, then we at least have confirmed that the Go
> > > > > > implementation is able to read variant files that are written by
> > > > > > the Java implementation. So there's at least some testing of the
> > > > > > two implementations against each other.
> > > > >
> > > > > --Matt
> > > > >
> > > > > >> On Tue, Jul 22, 2025 at 12:29 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> > > > >>
> > > > >> Have we tested the two implementations against one another?
> > > > >>
> > > > > >>> On Mon, Jul 21, 2025 at 9:14 PM Aihua Xu <aihu...@gmail.com> wrote:
> > > > >>>
> > > > >>> Hi community,
> > > > >>>
> > > > > >>> Per the Parquet specification requirements, two reference
> > > > > >>> implementations are needed to finalize the Variant logical type.
> > > > > >>> Both Java and Go implementations now support variant encoding and
> > > > > >>> shredding.
> > > > >>>
> > > > > >>> Java already has the encoding and shredding implementations in place:
> > > > > >>> apache/parquet-java#3197 <https://github.com/apache/parquet-java/pull/3197>
> > > > > >>> apache/parquet-java#3202 <https://github.com/apache/parquet-java/pull/3202>
> > > > > >>> apache/parquet-java#3223 <https://github.com/apache/parquet-java/issues/3223>
> > > > > >>> apache/parquet-java#3211 <https://github.com/apache/parquet-java/issues/3211>
> > > > >>>
> > > > >>> Go also includes encoding and shredding support:
> > > > > >>> apache/arrow-go#344 <https://github.com/apache/arrow-go/pull/344>
> > > > > >>> apache/arrow-go#434 <https://github.com/apache/arrow-go/pull/434>
> > > > >>>
> > > > > >>> I propose that we remove the "under development" notes from the
> > > > > >>> documentation and move forward with finalizing the specification
> > > > > >>> (PR #509 <https://github.com/apache/parquet-format/pull/509>).
> > > > > >>> This vote will be open for at least 72 hours.
> > > > >>>
> > > > > >>> [ ] +1 Finalize Variant and Shredding Spec
> > > > >>> [ ] +0
> > > > >>> [ ] -1 Do not release this because...
> > > > >>>
> > > > >>
> > > >
> > >
> >
>
