alamb commented on code in PR #8521:
URL: https://github.com/apache/arrow-rs/pull/8521#discussion_r2395920069
##########
parquet-variant-compute/src/arrow_to_variant.rs:
##########
@@ -38,6 +38,98 @@ use parquet_variant::{
use std::collections::HashMap;
use std::ops::Range;
+// ============================================================================
+// Shared traits and helpers for Arrow-to-Variant conversion
+// ============================================================================
+
+/// Zero-cost trait for converting Arrow array values to Variant
+pub(crate) trait ArrowToVariant: Array {
+ fn append_to_variant_builder(
+ &self,
+ builder: &mut impl VariantBuilderExt,
+ index: usize,
+ ) -> Result<(), ArrowError>;
+}
+
+/// Macro to define ArrowToVariant implementations with optional value
transformation
+macro_rules! define_arrow_to_variant {
Review Comment:
I agree this PR doesn't seem to have eliminated much yet
There seems to be two major pieces of functionality
1. Converting elements of Arrow Arrays to the corresponding Variant elements
2. Converting Variant arrays back to Arrow Arrays
It seems to me like `arrow_to_variant.rs` and `unshred_variant.rs` still
have non trivial overlap as you mention. What I was hoping was that we could
somehow use one set of traits / structs for both those operations which would
mean we would get `unshred` support "for free" for other types like
`UnionArray`s
That was the dream at anyways.
After I spent some time reviewing this PR I think I would like to propose
some naming consolidation as we have two functions that are inverses of each
other that are confusingly named;
* `cast_to_variant` which casts arrow arrays to variant
* `variant_to_arrow` which casts variants to arrow arrays
I will make a PR to propose `cast_arrow_to_variant` and
`cast_variant_to_arrow` for the kernel names and we can keep the modules
`variant_to_arrow` and `arrow_to_variant` with the actual code
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]