Re: [D] Using arrow and parquet_derive [arrow-rs]

via GitHub Wed, 24 Dec 2025 19:06:48 -0800


GitHub user ross-paypay closed a discussion: Using arrow and parquet_derive


Sorry if this is a basic question, but I can't seem to find an answer for this.

I have a Vec of custom structs that I am using to construct a `RecordBatch` by 
manually building each of the columns.

Below is a simplified example:

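```rs
use std::sync::Arc;
use arrow::array::{ArrayRef, StringArray};
use arrow::record_batch::RecordBatch;

struct Datum { foo: String }

let data: Vec<Datum> = vec![....];

// Borrow each `foo` as `&str`; `v.foo` alone would move out of a borrowed value.
let foo = Arc::new(StringArray::from(
    data.iter().map(|v| v.foo.as_str()).collect::<Vec<_>>(),
)) as ArrayRef;

// One (name, array, nullable) tuple per column.
let batch = RecordBatch::try_from_iter_with_nullable(vec![
    ("foo", foo, true),
]);
```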

This is working, but my data has 50+ fields so the mapping code is getting a 
bit rough to maintain.

Looking at `parquet_derive`, it seems like it could solve this by allowing me 
to derive the implementation.
However, the parquet 
[ParquetRecordWriter](https://docs.rs/parquet_derive/latest/parquet_derive/derive.ParquetRecordWriter.html)
 seems quite different from the arrow `RecordBatch` API.

Is there a way to have these interact in a nice way? (Or perhaps there is 
another solution in arrow that I am overlooking.)
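
For illustration, one way to cut down the per-field mapping code without any derive crate is a small declarative macro that expands a list of field names into one column each. This is only a minimal sketch under stated assumptions: every field is a `String`, the names `string_columns!`, `bar`, and `demo_columns` are hypothetical, and plain `Vec<&str>` columns stand in for arrow arrays so the snippet compiles with the standard library alone. In real code each column would be wrapped as `Arc::new(StringArray::from(column)) as ArrayRef` and handed to `RecordBatch::try_from_iter_with_nullable`, as in the example above.

```rust
/// Row type standing in for the much larger struct from the question.
/// The second field `bar` is hypothetical and only here to show two columns.
struct Datum {
    foo: String,
    bar: String,
}

/// Hypothetical helper: for each listed `String` field, build a
/// `(column_name, column_values)` pair by borrowing that field from every row.
macro_rules! string_columns {
    ($rows:expr, $($field:ident),+ $(,)?) => {
        vec![
            $(
                (
                    // The field name doubles as the column name.
                    stringify!($field),
                    $rows
                        .iter()
                        .map(|row| row.$field.as_str())
                        .collect::<Vec<&str>>(),
                )
            ),+
        ]
    };
}

/// Adding a field to the output means adding one identifier to this list.
fn demo_columns(rows: &[Datum]) -> Vec<(&'static str, Vec<&str>)> {
    string_columns!(rows, foo, bar)
}
```

With this shape, the 50+ hand-written `map`/`collect` expressions collapse into a single list of field names. Fields of other types (integers, optional values) would need additional macro arms or separate macros, which this sketch does not cover.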



GitHub link: https://github.com/apache/arrow-rs/discussions/7840
