aihuaxu opened a new pull request, #47835:
URL: https://github.com/apache/arrow/pull/47835

   ### Rationale for this change
   According to the [Variant 
specification](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md),
 the specification_version field must be set to 1 to indicate Variant encoding 
version 1. Currently, this field defaults to 0, which violates the 
specification. Parquet readers that strictly enforce specification version 
validation will fail to read files containing Variant types.
   <img width="624" height="185" alt="image" src="https://github.com/user-attachments/assets/b0f1deb9-0301-4b94-a472-17fd9cc0df5d" />
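
To illustrate the failure mode described above, here is a minimal sketch of the check a strict reader might perform. It is not Arrow's or any real reader's code: `VariantAnnotation`, `validate_variant_annotation`, and `SUPPORTED_VARIANT_SPEC_VERSION` are hypothetical names used only for illustration, and the sketch assumes the reader accepts version 1 only.

```python
# Hypothetical sketch: none of these names come from Arrow or parquet-format.
from dataclasses import dataclass

# Assumption: a strict reader accepts only Variant encoding version 1.
SUPPORTED_VARIANT_SPEC_VERSION = 1


@dataclass(frozen=True)
class VariantAnnotation:
    """Stand-in for the Variant logical type annotation stored in a file."""

    # Default of 1 mirrors the behavior this PR introduces.
    specification_version: int = 1


def validate_variant_annotation(annotation: VariantAnnotation) -> None:
    """Raise ValueError unless the annotation declares the supported version."""
    if annotation.specification_version != SUPPORTED_VARIANT_SPEC_VERSION:
        raise ValueError(
            "unsupported Variant specification_version: "
            f"{annotation.specification_version}"
        )


# An annotation built with the new default passes the strict check.
validate_variant_annotation(VariantAnnotation())

# An annotation carrying the old default of 0 is rejected.
try:
    validate_variant_annotation(VariantAnnotation(specification_version=0))
except ValueError as exc:
    print(f"rejected: {exc}")
```

Under these assumptions, files written with the old default of 0 would be rejected, while files written after this change would pass.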
   
   ### What changes are included in this PR?
   This PR changes the default specification version from 0 to 1.
   ### Are these changes tested?
   The change is covered by a unit test.
   ### Are there any user-facing changes?
   Parquet files written with Variant columns now carry the logical type annotation `VARIANT(1)`.
   
   ```
   Schema:
   message schema {
     optional group V (VARIANT(1)) = 1 {
       required binary metadata;
       required binary value;
     }
   }
   ```
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.