alamb commented on code in PR #7915: URL: https://github.com/apache/arrow-rs/pull/7915#discussion_r2205367028
########## parquet-variant/src/builder.rs: ########## @@ -350,14 +378,59 @@ impl<S: AsRef<str>> FromIterator<S> for MetadataBuilder { } } -impl<S: AsRef<str>> Extend<S> for MetadataBuilder { +impl<S: AsRef<str>> Extend<S> for DefaultMetadataBuilder { fn extend<T: IntoIterator<Item = S>>(&mut self, iter: T) { for field_name in iter { self.upsert_field_name(field_name.as_ref()); } } } +/// Read-only metadata builder that validates field names against an existing metadata dictionary Review Comment: > What do we gain by using views at this low level? The major benefit in my mind is that it means any row level reading of `Variant` is the same for shredded and unshredded Variants. For example we can implement the general purpose `variant_get` kernel in terms of a `Varaint` at first Then we can add the special paths / implementations for `varaint_get` on shredded arrays that are more efficient as follow on work. > If we could get away with "views" here on the unshredding path, why can't the native builder get away with "views" internally to avoid all the repeated copying of bytes as we unwind nested builders? I am not quite sure what you are proposing here, but it sounds good to me > Isolated reaction: I would favor Variant::ShreddedObject as its own enum variant, rather than making Variant::Object more complex. They're pretty fundamentally different cases. I agree What I plan to do is to try to help along https://github.com/apache/arrow-rs/pull/7914 and the variant_get API in https://github.com/apache/arrow-rs/pull/7919 With those in place I think we'll be in a place to try and sketch out how shredded variants could work -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org