alamb commented on code in PR #8090:
URL: https://github.com/apache/arrow-rs/pull/8090#discussion_r2271215030


##########
parquet-variant-compute/src/cast_to_variant.rs:
##########
@@ -151,6 +151,38 @@ pub fn cast_to_variant(input: &dyn Array) -> Result<VariantArray, ArrowError> {
         DataType::FixedSizeBinary(_) => {
             cast_conversion_nongeneric!(as_fixed_size_binary, |v| v, input, builder);
         }
+        DataType::Struct(_) => {
+            let struct_array = input.as_struct();
+            for i in 0..struct_array.len() {
+                if struct_array.is_null(i) {
+                    builder.append_null();
+                    continue;
+                }
+
+                // Create a VariantBuilder for this struct instance
+                let mut variant_builder = VariantBuilder::new();
+                let mut object_builder = variant_builder.new_object();
+
+                // Iterate through all fields in the struct
+                for (field_idx, field_name) in struct_array.column_names().iter().enumerate() {
+                    let field_array = struct_array.column(field_idx);
+
+                    // Recursively convert the field value to a variant
+                    if !field_array.is_null(i) {
+                        let field_variant_array = cast_to_variant(field_array)?;

Review Comment:
   I think as written this will cast the entire field array once per row and then read only a single row from the result each time, so the work grows roughly quadratically with the number of rows and will likely perform poorly.
   
   One way to avoid this might be to slice the field array and convert only the slice, e.g.:
   
   ```rust
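   // Take a zero-copy, one-row slice at row `i`; `slice` returns an `ArrayRef`
   let field_array = field_array.slice(i, 1);
   let field_variant_array = cast_to_variant(&field_array)?;
   let field_variant = field_variant_array.value(0); // the slice has one row, at index 0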
   ```
   
   Another way that might be even faster could be to recursively cast each field once at the start of the function, in pre-order, and then traverse the input fields in the same order (so you always have access to the current field as a variant).
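
The three strategies discussed in this comment can be compared with a toy model. This is only a sketch under loud assumptions: plain `Vec<Option<i64>>` columns stand in for Arrow arrays, strings stand in for variants, and `convert_column`, `build_rows`, `full_per_row`, `sliced_per_row`, and `once_up_front` are hypothetical names that do not exist in arrow-rs or parquet-variant-compute. The counter tracks only how many elements are visited, not per-call overhead.

```rust
use std::cell::Cell;

/// One output row: (field name, converted value) pairs, null fields skipped.
type Row = Vec<(String, String)>;

/// Hypothetical stand-in for casting one column. It renders each non-null
/// value as a string and bumps `work` once per element visited.
fn convert_column(column: &[Option<i64>], work: &Cell<usize>) -> Vec<Option<String>> {
    column
        .iter()
        .map(|v| {
            work.set(work.get() + 1);
            v.map(|x| x.to_string())
        })
        .collect()
}

/// Assembles rows from a per-cell lookup `cell(column, row)`.
/// Null cells are skipped before the lookup, mirroring the `is_null` check in the diff.
fn build_rows(
    names: &[&str],
    columns: &[Vec<Option<i64>>],
    cell: impl Fn(usize, usize) -> Option<String>,
) -> Vec<Row> {
    let rows = columns.first().map_or(0, |c| c.len());
    (0..rows)
        .map(|r| {
            (0..columns.len())
                .filter(|&c| columns[c][r].is_some())
                .filter_map(|c| cell(c, r).map(|v| (names[c].to_string(), v)))
                .collect()
        })
        .collect()
}

/// Strategy 1 (as in the diff): convert the whole column again for every row.
fn full_per_row(names: &[&str], columns: &[Vec<Option<i64>>], work: &Cell<usize>) -> Vec<Row> {
    build_rows(names, columns, |c, r| convert_column(&columns[c], work)[r].clone())
}

/// Strategy 2: convert only a one-element slice for every row.
fn sliced_per_row(names: &[&str], columns: &[Vec<Option<i64>>], work: &Cell<usize>) -> Vec<Row> {
    build_rows(names, columns, |c, r| {
        convert_column(&columns[c][r..r + 1], work)[0].clone()
    })
}

/// Strategy 3: convert every column once up front, then only index into the results.
fn once_up_front(names: &[&str], columns: &[Vec<Option<i64>>], work: &Cell<usize>) -> Vec<Row> {
    let converted: Vec<Vec<Option<String>>> =
        columns.iter().map(|col| convert_column(col, work)).collect();
    build_rows(names, columns, |c, r| converted[c][r].clone())
}
```

For two columns of three rows with a single null cell, all three functions build the same rows, but the first visits 15 elements, the second 5, and the third 6. The gap between the first strategy and the other two widens with the number of rows, since only the first re-reads every row of a column for each output row.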
   


