ariel-miculas opened a new issue, #9668:
URL: https://github.com/apache/arrow-rs/issues/9668

   Note we have also been trying to separate the IO from the decoding in 
Parquet -- see 
https://docs.rs/parquet/58.0.0/parquet/arrow/push_decoder/struct.ParquetPushDecoder.html
   
   Perhaps we could move Avro to that model too rather than implementing the 
async stuff first
   
   _Originally posted by @alamb in 
https://github.com/apache/arrow-rs/issues/9632#issuecomment-4194635214_
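
To make the quoted "push" model concrete, here is a minimal sketch of a decoder that never performs IO itself: the caller pushes bytes in and asks for a record whenever it likes. This is not the `ParquetPushDecoder` API or anything from arrow-avro. The names `LengthPrefixedPushDecoder` and `DecodeResult`, and the toy one-byte length-prefix framing, are assumptions made up for illustration.

```rust
/// Hypothetical outcome of one decode attempt (not an arrow-rs type).
#[derive(Debug, PartialEq)]
enum DecodeResult {
    /// The buffered bytes do not yet contain a complete record.
    NeedsData,
    /// One complete record was decoded and removed from the buffer.
    Record(Vec<u8>),
}

/// Hypothetical push decoder for a toy framing: one length byte,
/// followed by that many payload bytes.
#[derive(Default)]
struct LengthPrefixedPushDecoder {
    buffer: Vec<u8>,
}

impl LengthPrefixedPushDecoder {
    /// The caller owns all IO and hands over whatever bytes it has fetched.
    fn push(&mut self, bytes: &[u8]) {
        self.buffer.extend_from_slice(bytes);
    }

    /// Decode one record if enough bytes are buffered; otherwise ask for more.
    fn try_decode(&mut self) -> DecodeResult {
        let Some(&len) = self.buffer.first() else {
            return DecodeResult::NeedsData;
        };
        let end = 1 + len as usize;
        if self.buffer.len() < end {
            return DecodeResult::NeedsData;
        }
        let record = self.buffer[1..end].to_vec();
        self.buffer.drain(..end);
        DecodeResult::Record(record)
    }
}
```

Because the decoder only sees byte slices, the same type could in principle be driven from a synchronous reader, an async stream, or a test that feeds bytes by hand.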
   
   Issue:
   The async Avro reader fetches all the data upfront. Because the Avro file 
format is serial, decoding could instead overlap with data fetching 
(like DataFusion's JSON scan, for example).
   
   One potential solution: use an async stream, as presented in 
https://github.com/apache/arrow-rs/pull/9632
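
The overlap described above can be sketched with the standard library alone. Note that this swaps the async stream for a plain thread and a bounded channel, purely to keep the example dependency-free; it is not the approach from the linked PR. `fetch_block`, `decode_block`, and `pipelined_decode` are hypothetical stand-ins, not arrow-avro functions.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for fetching one block from storage.
fn fetch_block(index: usize) -> Vec<u8> {
    vec![index as u8; 4]
}

/// Hypothetical stand-in for decoding a block: here it just sums the bytes.
fn decode_block(block: &[u8]) -> u32 {
    block.iter().map(|&b| u32::from(b)).sum()
}

/// Fetch blocks on one thread while decoding them on another.
/// `prefetch` bounds how many fetched blocks may wait undecoded.
fn pipelined_decode(num_blocks: usize, prefetch: usize) -> Vec<u32> {
    let (tx, rx) = sync_channel::<Vec<u8>>(prefetch);

    // Producer: fetches blocks in file order and blocks when the channel is full.
    let fetcher = thread::spawn(move || {
        for i in 0..num_blocks {
            if tx.send(fetch_block(i)).is_err() {
                break; // the decoder side hung up
            }
        }
        // `tx` is dropped here, which ends the receiver's iterator.
    });

    // Consumer: decodes each block as soon as it arrives, in order.
    let decoded: Vec<u32> = rx.iter().map(|block| decode_block(&block)).collect();
    fetcher.join().expect("fetcher thread panicked");
    decoded
}
```

The bounded channel is the design choice that matters here: it lets fetching run ahead of decoding without buffering the whole file in memory.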


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
