scovich commented on issue #9113:
URL: https://github.com/apache/arrow-rs/issues/9113#issuecomment-3731096862

   > My use case is converting a row-wise format into Arrow which will then be 
manipulated some and eventually written as Parquet. One of the fields is a 
`prost_wkt_types::Struct` which is effectively the gRPC equivalent of JSON and 
is well suited to being converted and carried as a Variant. Since this crate 
only supports JSON, I needed to do that conversion myself. While I haven't 
benchmarked this, data-locality suggests I'm likely best off converting an 
entire row at a time into multiple Arrow arrays which put me roughly into the 
following code structure:
   > 
   > // Create builders for each field
   > let mut ids = StringBuilder::new();
   > let mut properties = VariantArrayBuilder::new();
   > 
   > // loop over rows and add the fields
   > for row in rows {
   >     ids.append_value(row.id);
   >     ...
   > }
   > 
   > // Finish all the builders, eventually returning a RecordBatch
   > ...
   
   That does sound right. You might check out the basic JSON to variant 
converter:
   
https://github.com/apache/arrow-rs/blob/main/parquet-variant-json/src/from_json.rs#L105-L130
   
   Or the arrow-compute JSON to variant converter which uses it (via macro, 
sorry it's harder to read):
   
https://github.com/apache/arrow-rs/blob/main/parquet-variant-compute/src/from_json.rs#L27-L48
   
   Notice how the basic converter takes `impl VariantBuilderExt`, and then the 
compute kernel leverages that to pass `&mut VariantArrayBuilder` via the 
JsonToVariant trait:
   
https://github.com/apache/arrow-rs/blob/main/parquet-variant-json/src/from_json.rs#L66-L79
   
   I realize, looking now, that it's all rather indirect. But in theory, you 
could do something that mimics the `append_json` function from my first link 
above:
   ```rust
   fn append_prost_struct(s: &prost_wkt_types::Struct, builder: &mut impl VariantBuilderExt) -> Result<(), ...> {
       /* loop over the fields, recursing on nested structs as needed */
   }
   ```
   and then just:
   ```rust
   for row in rows {
       ids.append_value(row.id);
       append_prost_struct(&row.properties, &mut properties)?;
   }
   // Finish all the builders
   ```
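
To make the "loop over the fields, recursing on nested structs" part concrete, here is a minimal, self-contained sketch of the recursion shape. Everything in it is a stand-in so that it compiles on its own: `Value` is a hypothetical simplification of the value kinds in `prost_wkt_types`, and `EventSink` / `Event` are hypothetical placeholders for whatever `VariantBuilderExt` actually offers. None of these names are the real parquet-variant or prost API, and the non-finite number check is only there to give the `Result` something to report.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the protobuf `Value` kinds (not the real prost type).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    List(Vec<Value>),
    Struct(BTreeMap<String, Value>),
}

/// Hypothetical flat event stream; a real builder would expose typed methods instead.
#[derive(Debug, PartialEq)]
enum Event {
    Null,
    Bool(bool),
    Number(f64),
    Str(String),
    StartList,
    EndList,
    StartObject,
    Key(String),
    EndObject,
}

/// Hypothetical placeholder for `VariantBuilderExt` (assumption, not the real trait).
trait EventSink {
    fn emit(&mut self, event: Event);
}

/// A plain `Vec<Event>` is enough of a sink to show the control flow.
impl EventSink for Vec<Event> {
    fn emit(&mut self, event: Event) {
        self.push(event);
    }
}

/// Append one value, recursing into lists and nested structs.
fn append_value(value: &Value, sink: &mut impl EventSink) -> Result<(), String> {
    match value {
        Value::Null => sink.emit(Event::Null),
        Value::Bool(b) => sink.emit(Event::Bool(*b)),
        Value::Number(n) => {
            // Illustrative failure mode only: reject NaN and infinities.
            if !n.is_finite() {
                return Err(format!("non-finite number: {n}"));
            }
            sink.emit(Event::Number(*n))
        }
        Value::String(s) => sink.emit(Event::Str(s.clone())),
        Value::List(items) => {
            sink.emit(Event::StartList);
            for item in items {
                append_value(item, sink)?;
            }
            sink.emit(Event::EndList);
        }
        Value::Struct(fields) => append_struct(fields, sink)?,
    }
    Ok(())
}

/// Append a struct: one key event per field, then the field's value.
fn append_struct(
    fields: &BTreeMap<String, Value>,
    sink: &mut impl EventSink,
) -> Result<(), String> {
    sink.emit(Event::StartObject);
    for (key, value) in fields {
        sink.emit(Event::Key(key.clone()));
        append_value(value, sink)?;
    }
    sink.emit(Event::EndObject);
    Ok(())
}
```

In the real code the sink would be the `&mut VariantArrayBuilder` from the row loop, and the `StartObject` / `StartList` events would presumably map onto whatever nested object and list builders `VariantBuilderExt` provides; `append_json` in the first link is the reference to follow for those details.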

