alamb commented on issue #6736:
URL: https://github.com/apache/arrow-rs/issues/6736#issuecomment-2781556103

   I have some news I would like to share here -- it seems that @PinkCrow007 
has actually been working on a variant implementation in parquet (including 
support in arrow-rs as an extension type)
   
   Here is an update from [Martin Prammer](https://www.cs.cmu.edu/~mprammer/) 
(not sure if he has a github handle)
   
   > We've made progress towards implementing the Variant type in both 
Parquet_rs and Arrow_rs and have prepared a document, [shared as a Google 
doc](https://docs.google.com/document/d/1Wv6FOFsZGibdd9hscorzXx0PwAAADQrQHTn36cjJKHA/edit?tab=t.0),
 that details the overall project and our current status. In summary, our 
current prototypes are focused on round-tripping binary data between Parquet 
and Arrow. The Arrow-side Variant is implemented as a CanonicalExtensionType, 
while the Parquet-side Variant is a LogicalType. If you're interested in 
looking at the code early, [Jiaying's 
fork](https://github.com/apache/arrow-rs/compare/main...PinkCrow007:arrow-rs:variant-clean)
 is publicly available. Our next goal is to implement binary data 
decoding/encoding to facilitate using Variants as a stand-alone type, which 
will then allow us to implement Variant shredding. While there's still work to 
do before we have the basic functionality for a Variant type, we plan to PR the 
baseline variant and then address shredding.
   >
   > At this point, it would be helpful for our team to connect to the broader 
Apache ecosystem's discussion on Variants; Jiaying has already joined the Arrow 
discord, and we're both happy to join any relevant mailing lists. We're also 
soliciting existing Variant implementations that we can use to verify our 
library against.
   
   It seems they also need some example variant data to make faster progress. I 
will go beg some more from the parquet mailing list.
   
   It is very exciting to see the momentum picking up.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
