alamb commented on issue #7715: URL: https://github.com/apache/arrow-rs/issues/7715#issuecomment-3029320206
BTW I made some slides to try and show how I think parquet is going to work: https://docs.google.com/presentation/d/1NN583KuJ3nelIrrH64HAASmbRtO9khhNSO3fw0_Nslc/edit?slide=id.g33d6952b95a_0_321#slide=id.g33d6952b95a_0_321

<img width="1231" alt="Image" src="https://github.com/user-attachments/assets/c93bd78e-30cd-455e-b43c-b656d96bac5d" />

I would suggest you begin by sketching out what the APIs look like with an example use case:

1. You have a variant encoded as a StructArray with metadata/value fields (see slides)
2. You want to "shred" it -- extract some fields from the variant into new columns

So maybe we need a "shred" kernel that takes a StructArray in with a shredding specification and produces a StructArray out with the appropriate fields added 🤔
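
To make the shape of such a kernel concrete, here is a minimal, std-only Rust sketch of the "shred" step described above. Everything in it is an assumption for illustration: `Variant`, `ShredSpec`, `Shredded`, and `shred` are hypothetical names, not arrow-rs APIs. Plain `Vec`s stand in for the columns of a StructArray, a decoded enum stands in for the binary metadata/value encoding, and only top-level integer fields are shredded.

```rust
use std::collections::BTreeMap;

/// Simplified, already-decoded stand-in for a variant value.
/// (Assumption: the real encoding is binary metadata/value buffers.)
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Variant {
    Int(i64),
    Str(String),
    Object(BTreeMap<String, Variant>),
}

/// Hypothetical shredding specification: top-level object fields
/// that should be extracted into their own i64 columns.
struct ShredSpec {
    int_fields: Vec<String>,
}

/// Stand-in for the output StructArray: the residual variant column
/// plus one nullable typed column per shredded field.
#[derive(Debug)]
struct Shredded {
    residual: Vec<Variant>,
    typed: BTreeMap<String, Vec<Option<i64>>>,
}

/// Moves each requested field out of every row into a typed column.
/// Rows that are not objects, or where the field is missing or not an
/// integer, get `None` in the typed column and keep their value as-is.
fn shred(input: &[Variant], spec: &ShredSpec) -> Shredded {
    let mut typed: BTreeMap<String, Vec<Option<i64>>> = spec
        .int_fields
        .iter()
        .map(|name| (name.clone(), Vec::with_capacity(input.len())))
        .collect();
    let mut residual = Vec::with_capacity(input.len());

    for row in input {
        let mut rest = row.clone();
        for (name, column) in typed.iter_mut() {
            let mut extracted = None;
            if let Variant::Object(fields) = &mut rest {
                // Copy the integer out first, then remove the field.
                extracted = match fields.get(name) {
                    Some(Variant::Int(v)) => Some(*v),
                    _ => None,
                };
                if extracted.is_some() {
                    fields.remove(name);
                }
            }
            column.push(extracted);
        }
        residual.push(rest);
    }

    Shredded { residual, typed }
}
```

In this sketch, shredding field `a` out of the rows `{"a": 1, "b": "x"}` and `7` would give a typed column `a` of `[Some(1), None]` and a residual column of `{"b": "x"}` and `7`. A real kernel would presumably operate on the binary buffers directly and support nested paths and more types; this sketch only shows the input/spec/output shape.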
