PinkCrow007 opened a new pull request, #7404:
URL: https://github.com/apache/arrow-rs/pull/7404

   This is a prototype that explores how we might implement Variant types in 
Parquet, including their data structures, and expose them in Arrow as 
extension types. Although this work is not yet complete, we (the CMU Variant 
team) are opening this PR so everyone has a central place to communicate and 
coordinate work.
   
   Currently, our implementation is set up as follows:
   
   - In Arrow, we represent Variants using a canonical extension type over 
binary types.
   - In Parquet, we add Variant as a logical type.
   - We also add a top-level "arrow-variant" crate to centralize parsing 
logic, similar to arrow-json.
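
For readers unfamiliar with Arrow extension types: an extension type is ordinary storage (here, binary) plus two field-metadata keys, `ARROW:extension:name` and `ARROW:extension:metadata`. The following is a minimal, std-only sketch of that idea, not the code in this PR. `SketchField`, `StorageType`, and the extension name `example.variant` are hypothetical placeholders; the real name and metadata payload are whatever the canonical extension type ends up specifying.

```rust
use std::collections::HashMap;

/// Field-metadata keys that Arrow uses to mark a field as an extension type.
pub const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";
pub const EXTENSION_METADATA_KEY: &str = "ARROW:extension:metadata";

/// Hypothetical placeholder, NOT the registered canonical name.
pub const ASSUMED_VARIANT_NAME: &str = "example.variant";

/// Stand-in for an Arrow storage type (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub enum StorageType {
    Binary,
    Utf8,
}

/// Stand-in for an Arrow field: a name, a storage type, and string metadata.
#[derive(Debug, Clone)]
pub struct SketchField {
    pub name: String,
    pub storage: StorageType,
    pub metadata: HashMap<String, String>,
}

/// Build a binary-backed field tagged as a Variant extension type.
pub fn variant_field(name: &str) -> SketchField {
    let mut metadata = HashMap::new();
    metadata.insert(EXTENSION_NAME_KEY.to_string(), ASSUMED_VARIANT_NAME.to_string());
    // Empty serialized metadata: the payload format is left unspecified here.
    metadata.insert(EXTENSION_METADATA_KEY.to_string(), String::new());
    SketchField {
        name: name.to_string(),
        storage: StorageType::Binary,
        metadata,
    }
}

/// A field counts as a Variant only if both the tag and the storage match.
pub fn is_variant(field: &SketchField) -> bool {
    field.storage == StorageType::Binary
        && field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str)
            == Some(ASSUMED_VARIANT_NAME)
}
```

A consumer that does not recognize the extension name can still read the column as plain binary, which is the main appeal of layering Variant over an existing storage type.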
   
   Our current goal is to land a minimal Variant implementation, with the 
intent of adding shredding functionality later. Our current roadmap is as 
follows:
   
   - [x] Round-trip Variant between Arrow and Parquet while preserving data.
   - [x] Implement Variant binary encoding/decoding.
   - [x] Round-trip between JSON and Variant.  
   - [ ] Retrieve Variant data by key.
   - [ ] Verify Variant binary compatibility with existing implementations.
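
To make the "binary encoding/decoding" item concrete, here is a small sketch of how a few Variant values could be laid out, based on our reading of the Parquet Variant encoding spec: each value starts with a one-byte header whose low two bits hold the basic type and whose upper six bits hold a type-specific header (the primitive type id, or the length for short strings). This is an illustration under those assumptions, not the encoder in this PR; `SketchValue`, `encode_sketch`, and `decode_sketch` are hypothetical names, and only four value kinds are covered.

```rust
/// Basic type tags stored in the low two bits of the value header.
const BASIC_PRIMITIVE: u8 = 0;
const BASIC_SHORT_STRING: u8 = 1;

/// Primitive type ids stored in the upper six bits (assumed from the spec).
const PRIM_NULL: u8 = 0;
const PRIM_TRUE: u8 = 1;
const PRIM_FALSE: u8 = 2;
const PRIM_INT8: u8 = 3;

/// A tiny subset of Variant values, enough to show the header layout.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchValue {
    Null,
    Bool(bool),
    Int8(i8),
    ShortString(String),
}

/// Pack the six-bit value header and two-bit basic type into one byte.
fn header(basic_type: u8, value_header: u8) -> u8 {
    (value_header << 2) | basic_type
}

pub fn encode_sketch(v: &SketchValue) -> Result<Vec<u8>, String> {
    match v {
        SketchValue::Null => Ok(vec![header(BASIC_PRIMITIVE, PRIM_NULL)]),
        SketchValue::Bool(b) => {
            let id = if *b { PRIM_TRUE } else { PRIM_FALSE };
            Ok(vec![header(BASIC_PRIMITIVE, id)])
        }
        SketchValue::Int8(i) => Ok(vec![header(BASIC_PRIMITIVE, PRIM_INT8), *i as u8]),
        SketchValue::ShortString(s) => {
            // The length must fit in the six header bits.
            if s.len() > 63 {
                return Err("short string is limited to 63 bytes".to_string());
            }
            let mut out = Vec::with_capacity(1 + s.len());
            out.push(header(BASIC_SHORT_STRING, s.len() as u8));
            out.extend_from_slice(s.as_bytes());
            Ok(out)
        }
    }
}

pub fn decode_sketch(bytes: &[u8]) -> Result<SketchValue, String> {
    let (&first, rest) = bytes.split_first().ok_or_else(|| "empty input".to_string())?;
    let basic = first & 0b11;
    let value_header = first >> 2;
    match (basic, value_header) {
        (BASIC_PRIMITIVE, PRIM_NULL) => Ok(SketchValue::Null),
        (BASIC_PRIMITIVE, PRIM_TRUE) => Ok(SketchValue::Bool(true)),
        (BASIC_PRIMITIVE, PRIM_FALSE) => Ok(SketchValue::Bool(false)),
        (BASIC_PRIMITIVE, PRIM_INT8) => rest
            .first()
            .map(|b| SketchValue::Int8(*b as i8))
            .ok_or_else(|| "truncated int8".to_string()),
        (BASIC_SHORT_STRING, len) => {
            let raw = rest
                .get(..len as usize)
                .ok_or_else(|| "truncated short string".to_string())?;
            String::from_utf8(raw.to_vec())
                .map(SketchValue::ShortString)
                .map_err(|e| e.to_string())
        }
        _ => Err(format!("unsupported header byte {first:#04x}")),
    }
}
```

Objects, arrays, wider integers, and the separate metadata dictionary are omitted; checking bytes like these against other implementations is what the binary-compatibility item above is meant to cover.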
   
   I am the main engineer working on this project, with support from @mprammer, 
@pateljm, and @apavlo. This is our team's first Apache PR. We have been in 
contact with @alamb and @adriangb in the lead-up to this draft PR, and thank 
them both for their insight and support.
   
   # Which issue does this PR close?
   
   [[Parquet] Implement Variant type support in Parquet 
#6736](https://github.com/apache/arrow-rs/issues/6736)
   

