I was under the impression that parquet-testing does not yet have Parquet files with variant type annotations.
Is this still the case? If so, should we add some (shredded and unshredded) files produced by the Java and Go implementations?

On Wed, Jul 23, 2025 at 3:18 AM Aihua Xu <aihu...@gmail.com> wrote:

> Thanks Matt for the comment and working on the Go variant.
>
> Micah, that’s a good point. Let me check out the coverage completeness for
> these two implementations.
>
> > On Jul 22, 2025, at 10:01 AM, Matt Topol <zotthewiz...@gmail.com> wrote:
> >
> > Assuming that the files with variants in
> > https://github.com/apache/parquet-testing are generated by parquet-java,
> > then we at least have confirmed that the Go implementation is able to read
> > variant files that are written by the Java implementation. So there's at
> > least some testing of the two implementations against each other.
> >
> > --Matt
> >
> >> On Tue, Jul 22, 2025 at 12:29 AM Micah Kornfield <emkornfi...@gmail.com>
> >> wrote:
> >>
> >> Have we tested the two implementations against one another?
> >>
> >>> On Mon, Jul 21, 2025 at 9:14 PM Aihua Xu <aihu...@gmail.com> wrote:
> >>>
> >>> Hi community,
> >>>
> >>> Per the Parquet specification requirements, two reference implementations
> >>> are needed to finalize the Variant logical type. Both Java and Go
> >>> implementations now support variant encoding and shredding.
> >>>
> >>> Java already has the encoding and shredding implementations in place:
> >>> apache/parquet-java#3197 <https://github.com/apache/parquet-java/pull/3197>
> >>> apache/parquet-java#3202 <https://github.com/apache/parquet-java/pull/3202>
> >>> apache/parquet-java#3223 <https://github.com/apache/parquet-java/issues/3223>
> >>> apache/parquet-java#3211 <https://github.com/apache/parquet-java/issues/3211>
> >>>
> >>> Go also includes encoding and shredding support:
> >>> apache/arrow-go#344 <https://github.com/apache/arrow-go/pull/344>
> >>> apache/arrow-go#434 <https://github.com/apache/arrow-go/pull/434>
> >>>
> >>> I propose that we remove the "under development" notes from the
> >>> documentation and move forward with finalizing the specification (PR #509
> >>> <https://github.com/apache/parquet-format/pull/509>).
> >>> This vote will be open for at least 72 hours.
> >>>
> >>> [ ] +1 Finalize Variant and Shredding Spec
> >>> [ ] +0
> >>> [ ] -1 Do not release this because...