Thanks, Matt.

> On Jul 27, 2025, at 2:31 PM, Matt Topol <zotthewiz...@gmail.com> wrote:
> 
> I'll work this week on getting the Go implementation to use the same
> testing files and ensure compatibility.
> 
>> On Sun, Jul 27, 2025, 5:28 PM Aihua Xu <aihu...@gmail.com> wrote:
>> 
>> Hi all,
>> 
>> Following up on the test effort to validate the compatibility of the
>> Variant implementation:
>> 
>> Ryan has contributed test cases
>> <https://github.com/apache/parquet-testing/pull/90/files> from Iceberg
>> (see PR
>> #13654 <https://github.com/apache/iceberg/pull/13654>), which I used to
>> verify <https://github.com/apache/parquet-java/pull/3258/> the Variant
>> implementation in Parquet-Java. The validation surfaced a few minor issues,
>> but overall the results confirm compatibility between the two
>> implementations.
>> 
>> Let me know if you have any questions or additional follow-up requests.
>> 
>> Thanks,
>> 
>> Aihua
>> 
>> 
>> 
>> On Wed, Jul 23, 2025 at 2:24 AM Andrew Lamb <andrewlam...@gmail.com>
>> wrote:
>> 
>>> I agree the parquet-testing repo should have example Parquet files
>>> storing variants.
>>> 
>>> It was brought to my attention recently that the duckdb folks made some
>>> testing files[1] based on the Iceberg test suite.
>>> 
>>> Perhaps we can add those files to parquet-testing as part of [2].
>>> 
>>> I expect we'll get to testing the Rust shredding implementation in 2-3
>>> weeks, at which time I will likely help push this forward. It would
>>> be great if someone else wanted to help do it beforehand.
>>> 
>>> Andrew
>>> 
>>> [1]: https://github.com/duckdb/duckdb/pull/18224
>>> [2]: https://github.com/apache/parquet-testing/issues/75
>>> 
>>>> On Wed, Jul 23, 2025 at 1:14 AM Gang Wu <ust...@gmail.com> wrote:
>>> 
>>>> I was under the impression that parquet-testing does not yet have
>>>> Parquet files with variant type annotations.
>>>> 
>>>> Is this still the case? If so, should we add some (shredded and
>>>> unshredded) files produced by the Java and Go implementations?
>>>> 
>>>> On Wed, Jul 23, 2025 at 3:18 AM Aihua Xu <aihu...@gmail.com> wrote:
>>>> 
>>>>> Thanks, Matt, for the comment and for working on the Go variant.
>>>>> 
>>>>> Micah, that’s a good point. Let me check the coverage completeness
>>>>> for these two implementations.
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jul 22, 2025, at 10:01 AM, Matt Topol <zotthewiz...@gmail.com> wrote:
>>>>>> 
>>>>>> Assuming that the files with variants in
>>>>>> https://github.com/apache/parquet-testing are generated by
>>>>>> parquet-java, then we at least have confirmed that the Go
>>>>>> implementation is able to read variant files that are written by the
>>>>>> Java implementation. So there's at least some testing of the two
>>>>>> implementations against each other.
>>>>>> 
>>>>>> --Matt
>>>>>> 
>>>>>>> On Tue, Jul 22, 2025 at 12:29 AM Micah Kornfield <emkornfi...@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Have we tested the two implementations against one another?
>>>>>>> 
>>>>>>>> On Mon, Jul 21, 2025 at 9:14 PM Aihua Xu <aihu...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Hi community,
>>>>>>>> 
>>>>>>>> Per the Parquet specification requirements, two reference
>>>>>>>> implementations are needed to finalize the Variant logical type. Both
>>>>>>>> Java and Go implementations now support variant encoding and shredding.
>>>>>>>> 
>>>>>>>> Java already has the encoding and shredding implementations in place:
>>>>>>>> apache/parquet-java#3197
>>>>>>>> <https://github.com/apache/parquet-java/pull/3197>
>>>>>>>> apache/parquet-java#3202
>>>>>>>> <https://github.com/apache/parquet-java/pull/3202>
>>>>>>>> apache/parquet-java#3223
>>>>>>>> <https://github.com/apache/parquet-java/issues/3223>
>>>>>>>> apache/parquet-java#3211
>>>>>>>> <https://github.com/apache/parquet-java/issues/3211>
>>>>>>>> 
>>>>>>>> Go also includes encoding and shredding support:
>>>>>>>> apache/arrow-go#344 <https://github.com/apache/arrow-go/pull/344>
>>>>>>>> apache/arrow-go#434 <https://github.com/apache/arrow-go/pull/434>
>>>>>>>> 
>>>>>>>> I propose that we remove the "under development" notes from the
>>>>>>>> documentation and move forward with finalizing the specification
>>>>>>>> (PR #509 <https://github.com/apache/parquet-format/pull/509>).
>>>>>>>> This vote will be open for at least 72 hours.
>>>>>>>> 
>>>>>>>> [ ] +1 Finalize Variant and Shredding Spec
>>>>>>>> [ ] +0
>>>>>>>> [ ] -1 Do not release this because...
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
