Ten0 commented on issue #4886: URL: https://github.com/apache/arrow-rs/issues/4886#issuecomment-1922019175
> I'll try to get what I have polished up over the next few days, and we can compare benchmarks.

Here's a quick POC for full-featured Avro to Arrow using `serde_avro_fast`, `serde_arrow` and `serde_transcode`:

https://github.com/Ten0/arrow_serde_avro/blob/0ea1292064f877b210211c09d001e7b7db02fbdf/tests/simple.rs#L60-L61
https://github.com/Ten0/arrow_serde_avro/blob/0ea1292064f877b210211c09d001e7b7db02fbdf/src/lib.rs#L8

It fits in <150 lines total ATM and successfully loads Avro object container files into Arrow `RecordBatch`es. ([Schema conversion](https://github.com/Ten0/arrow_serde_avro/blob/0ea1292064f877b210211c09d001e7b7db02fbdf/src/schema_conversion.rs#L8) is pretty basic ATM, but it's straightforward to add more.)

Performance of `serde_arrow` should be very close to a zero-cost abstraction since https://github.com/chmp/serde_arrow/pull/120. The only clear areas of potential performance improvement for this particular integration are https://github.com/chmp/serde_arrow/pull/120#discussion_r1468664717, https://github.com/chmp/serde_arrow/pull/120#discussion_r1468667388 and https://github.com/chmp/serde_arrow/issues/92#issuecomment-1895467586, and those are reasonably simple fixes. I'll probably PR them before running benchmarks (if @chmp hasn't done it first 🚀)

> You might also be interested in [arrow_json::Decoder::serialize](https://docs.rs/arrow-json/50.0.0/arrow_json/reader/struct.Decoder.html#method.serialize)

It goes through a significant intermediate representation (the "tape" thing). It seems pretty clear that this is indeed why it's so far behind [in the benchmarks](https://github.com/chmp/serde_arrow/blob/519c6ee4ae74904b17b12616c8400e83ab206faf/Readme.md).
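To make the "tape" point concrete, here is a minimal std-only toy model of the two decoding strategies. It is not the actual `arrow_json` or `serde_arrow` code: `Event`, `Columns`, `decode_rows`, `via_tape` and `streaming` are all hypothetical names invented for illustration, and the schema is hard-coded to `(id, name)`. The only thing it tries to show is that the tape approach materialises every event before building columns, while the transcode-style approach pushes each event straight into the column builders.

```rust
/// Toy events standing in for what a row-oriented decoder emits (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Event {
    StartRow,
    Int(i64),
    Str(String),
    EndRow,
}

/// Toy column builders for a fixed two-field schema: (id: i64, name: String).
#[derive(Debug, Default, PartialEq)]
struct Columns {
    ids: Vec<i64>,
    names: Vec<String>,
}

impl Columns {
    /// Route one event to the matching column; row markers carry no data.
    fn push(&mut self, event: Event) {
        match event {
            Event::Int(v) => self.ids.push(v),
            Event::Str(s) => self.names.push(s),
            Event::StartRow | Event::EndRow => {}
        }
    }
}

/// Hypothetical row source: lazily yields four events per input row.
fn decode_rows<'a>(rows: &'a [(i64, &'a str)]) -> impl Iterator<Item = Event> + 'a {
    rows.iter().flat_map(|&(id, name)| {
        [
            Event::StartRow,
            Event::Int(id),
            Event::Str(name.to_string()),
            Event::EndRow,
        ]
    })
}

/// Tape-style: buffer all events first, then build columns in a second pass.
/// Also returns the tape length to show how much intermediate state existed.
fn via_tape<'a>(rows: &'a [(i64, &'a str)]) -> (Columns, usize) {
    let tape: Vec<Event> = decode_rows(rows).collect();
    let tape_len = tape.len();
    let mut cols = Columns::default();
    for event in tape {
        cols.push(event);
    }
    (cols, tape_len)
}

/// Streaming-style: each event goes directly into the builders, no buffer.
fn streaming<'a>(rows: &'a [(i64, &'a str)]) -> Columns {
    let mut cols = Columns::default();
    for event in decode_rows(rows) {
        cols.push(event);
    }
    cols
}
```

Calling `via_tape(&rows)` and `streaming(&rows)` on the same input yields identical `Columns`; the difference is only the extra `Vec<Event>` that the tape variant allocates and walks a second time.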
