GitHub user ross-paypay closed a discussion: Using arrow and parquet_derive
Sorry if this is a basic question, but I can't seem to find an answer for this.
I have a Vec of custom structs that I am using to construct a `RecordBatch` by
manually building each of the columns.
Below is a simplified example:
```rs
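use std::sync::Arc;
use arrow::array::{ArrayRef, StringArray};
use arrow::record_batch::RecordBatch;

struct Datum { foo: String }
let data: Vec<Datum> = vec![/* .... */];
// Borrow each field as &str; a bare `v.foo` would move out of the borrowed item.
let foo = Arc::new(StringArray::from(
    data.iter().map(|v| v.foo.as_str()).collect::<Vec<_>>(),
)) as ArrayRef;
// Each tuple is (column name, column data, nullable).
let batch = RecordBatch::try_from_iter_with_nullable(vec![
    ("foo", foo, true),
]);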
```
This is working, but my data has 50+ fields so the mapping code is getting a
bit rough to maintain.
Looking at `parquet_derive`, it seems like it could solve this by allowing me
to derive the implementation.
However, the parquet
[ParquetRecordWriter](https://docs.rs/parquet_derive/latest/parquet_derive/derive.ParquetRecordWriter.html)
seems quite different from the arrow `RecordBatch` API.
Is there a way to have these interact in a nice way? (Or perhaps there is
another solution in arrow that I am overlooking.)
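
To make the maintenance problem concrete, here is a minimal sketch of one way the per-field mapping could be generated with a declarative macro instead of being written by hand. It is not part of arrow or `parquet_derive`: the macro name `columnar_struct!`, the generated `columns` function, and the extra `bar` field are all hypothetical, and the sketch assumes every field is a `String`. It uses only the standard library, so the arrow calls appear only in a comment.

```rs
// Hypothetical helper macro (not an arrow or parquet_derive API): declares a
// struct whose fields are all `String` and generates a `columns` function that
// pulls each field out into its own named column.
macro_rules! columnar_struct {
    ($name:ident { $($field:ident),+ $(,)? }) => {
        #[derive(Debug, Clone)]
        struct $name { $($field: String),+ }

        impl $name {
            /// Returns one (column name, values) pair per field, in declaration order.
            fn columns(rows: &[$name]) -> Vec<(&'static str, Vec<&str>)> {
                vec![
                    $((
                        stringify!($field),
                        rows.iter().map(|r| r.$field.as_str()).collect::<Vec<&str>>(),
                    )),+
                ]
            }
        }
    };
}

// Adding a field here is the only change needed to get a new column.
columnar_struct!(Datum { foo, bar });

fn main() {
    let data = vec![
        Datum { foo: "a".into(), bar: "x".into() },
        Datum { foo: "b".into(), bar: "y".into() },
    ];
    let cols = Datum::columns(&data);
    // With arrow in scope, each pair could be mapped to
    // (name, Arc::new(StringArray::from(values)) as ArrayRef, true)
    // and the result passed to RecordBatch::try_from_iter_with_nullable.
    for (name, values) in &cols {
        println!("{name}: {values:?}");
    }
}
```

A real struct with mixed field types would need the macro to accept a type per field and pick the matching arrow array type, which this sketch deliberately leaves out.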
GitHub link: https://github.com/apache/arrow-rs/discussions/7840