ariel-miculas opened a new issue, #9668: URL: https://github.com/apache/arrow-rs/issues/9668
Note we have also been trying to separate the IO from the decoding in Parquet -- see https://docs.rs/parquet/58.0.0/parquet/arrow/push_decoder/struct.ParquetPushDecoder.html

Perhaps we could move Avro to that model too rather than implementing the async stuff first

_Originally posted by @alamb in https://github.com/apache/arrow-rs/issues/9632#issuecomment-4194635214_

Issue: The async Avro reader reads all the data upfront, even though the Avro file format is serial, so decoding and data fetching could happen in parallel (like DataFusion's JSON scan, for example).

One potential solution: use an async stream, as presented in https://github.com/apache/arrow-rs/pull/9632
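
For illustration only, here is a minimal sketch of the idea of separating IO from decoding so that fetching and decoding can overlap. Everything in it is an assumption made for the example: `ToyPushDecoder`, `decode_streaming`, and the toy record format (a 1-byte length followed by the payload) are hypothetical and are not the Avro format, the arrow-avro API, or the code in the linked PR. A thread with a bounded channel stands in for an async stream.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical push-style decoder for a toy format: each record is a
/// 1-byte length followed by that many payload bytes. This is NOT Avro
/// and NOT an arrow-avro type; it only illustrates the shape of the idea.
#[derive(Default)]
pub struct ToyPushDecoder {
    buf: Vec<u8>,
}

impl ToyPushDecoder {
    /// Hand the decoder more bytes. The decoder itself performs no IO.
    pub fn push(&mut self, chunk: &[u8]) {
        self.buf.extend_from_slice(chunk);
    }

    /// Return the next complete record, or `None` if more bytes are needed.
    pub fn try_next(&mut self) -> Option<Vec<u8>> {
        let len = *self.buf.first()? as usize;
        if self.buf.len() < 1 + len {
            return None; // record is split across chunks; wait for more data
        }
        let record = self.buf[1..1 + len].to_vec();
        self.buf.drain(..1 + len);
        Some(record)
    }
}

/// Decode records while a separate thread "fetches" chunks, so the two
/// steps overlap instead of reading everything upfront.
pub fn decode_streaming(chunks: Vec<Vec<u8>>) -> Vec<Vec<u8>> {
    // Bounded channel: the fetcher can run at most 2 chunks ahead.
    let (tx, rx) = sync_channel::<Vec<u8>>(2);

    // Stand-in for the IO side (e.g. object store range requests).
    let fetcher = thread::spawn(move || {
        for chunk in chunks {
            if tx.send(chunk).is_err() {
                break; // receiver went away; stop fetching
            }
        }
    });

    let mut decoder = ToyPushDecoder::default();
    let mut records = Vec::new();
    // The loop ends once the fetcher finishes and drops the sender.
    for chunk in rx {
        decoder.push(&chunk);
        while let Some(record) = decoder.try_next() {
            records.push(record);
        }
    }

    fetcher.join().expect("fetcher thread panicked");
    records
}
```

Calling `decode_streaming` with chunks that split a record across a boundary still yields whole records, because the decoder buffers incomplete input until the rest arrives. The same decoder could be driven from an async stream without changes, since it never touches IO itself.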
