gatesn commented on issue #7983:
URL: https://github.com/apache/arrow-rs/issues/7983#issuecomment-3165065017

   I'm not sure I understand why this model isn't possible with the pull-based 
reader? I could implement an 
[AsyncFileReader](https://docs.rs/parquet/latest/parquet/arrow/async_reader/trait.AsyncFileReader.html)
 that enqueues IO requests to a channel along with a oneshot callback, and 
returns the oneshot as the future. Now I have a stream of IO requests (that can 
be handled however we like), and a stream of record batches that no longer 
depends on Tokio and can be driven from a futures `block_on` to expose a sync 
API. This is similar to many other Rust APIs, e.g. tokio-postgres, where you're 
given a background connection (IO stream) to spawn: 
https://docs.rs/tokio-postgres/latest/tokio_postgres/
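
A rough, std-only sketch of that channel-plus-oneshot shape, to make the idea concrete. Everything named here (`ChannelReader`, `IoRequest`, `Reply`, `Completer`, the hand-rolled `block_on`) is hypothetical and not part of the parquet crate; `get_bytes` only loosely stands in for the corresponding `AsyncFileReader` method, and a real implementation would more likely use a oneshot channel and `block_on` from the futures crate.

```rust
use std::future::Future;
use std::ops::Range;
use std::pin::Pin;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread;

/// Shared state of a tiny hand-rolled oneshot (assumption: stands in for a
/// real oneshot channel).
struct Slot {
    value: Option<Vec<u8>>,
    waker: Option<Waker>,
}

/// Future handed back to the reader; resolves once the IO side fills the slot.
pub struct Reply(Arc<Mutex<Slot>>);
/// Handle the IO side uses to complete exactly one request.
pub struct Completer(Arc<Mutex<Slot>>);

impl Completer {
    pub fn complete(self, bytes: Vec<u8>) {
        let waker = {
            let mut slot = self.0.lock().unwrap();
            slot.value = Some(bytes);
            slot.waker.take()
        };
        // Wake outside the lock so the woken task can poll immediately.
        if let Some(w) = waker {
            w.wake();
        }
    }
}

impl Future for Reply {
    type Output = Vec<u8>;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<u8>> {
        let mut slot = self.0.lock().unwrap();
        match slot.value.take() {
            Some(bytes) => Poll::Ready(bytes),
            None => {
                slot.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// One entry in the stream of IO requests.
pub struct IoRequest {
    pub range: Range<usize>,
    pub done: Completer,
}

/// Hypothetical reader: enqueues the request and returns the oneshot as the future.
pub struct ChannelReader {
    tx: Sender<IoRequest>,
}

impl ChannelReader {
    pub fn get_bytes(&self, range: Range<usize>) -> Reply {
        let slot = Arc::new(Mutex::new(Slot { value: None, waker: None }));
        let done = Completer(slot.clone());
        self.tx.send(IoRequest { range, done }).expect("IO side dropped");
        Reply(slot)
    }
}

/// Minimal runtime-free `block_on`: parks the calling thread until woken.
struct ThreadWaker(thread::Thread);
impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(v) => return v,
            Poll::Pending => thread::park(),
        }
    }
}

/// Serves requests from an in-memory "file" on a plain thread, no Tokio involved.
pub fn demo() -> Vec<u8> {
    let file: Vec<u8> = (0u8..32).collect();
    let (tx, rx) = channel::<IoRequest>();
    let io = thread::spawn(move || {
        for IoRequest { range, done } in rx {
            done.complete(file[range].to_vec());
        }
    });
    let reader = ChannelReader { tx };
    let bytes = block_on(reader.get_bytes(4..8));
    drop(reader); // closes the channel so the IO thread exits
    io.join().unwrap();
    bytes
}
```

Here the receiving end of the channel is the "stream of IO requests", and whoever owns it decides how and where the reads actually happen.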
   
   One possible solution to prefetching that we thought of is to have a custom 
implementation of `StreamExt::buffered` that doesn't take a constant value, but 
instead takes a handle to the IO dispatcher and continues to pull futures and 
poll them until the IO prefetching queue reaches a sufficient size.
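
A minimal sketch of what such an adaptive `buffered` might look like, assuming a hypothetical `IoHandle` that exposes only the dispatcher's queue depth. To stay std-only it wraps an `Iterator` of futures and exposes a `poll_next` method that mirrors `Stream::poll_next` rather than implementing the futures-crate `Stream` trait; `AdaptiveBuffered`, `FakeRead` and `demo` are illustrative names, not existing APIs.

```rust
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical dispatcher handle: only exposes the current IO queue depth.
#[derive(Clone, Default)]
pub struct IoHandle(Arc<AtomicUsize>);

impl IoHandle {
    pub fn queued(&self) -> usize { self.0.load(Ordering::SeqCst) }
    pub fn enqueue(&self) { self.0.fetch_add(1, Ordering::SeqCst); }
    pub fn dequeue(&self) { self.0.fetch_sub(1, Ordering::SeqCst); }
}

/// A future that is still running, or its output waiting to be yielded in order.
enum Entry<F: Future> {
    Running(Pin<Box<F>>),
    Done(F::Output),
}

/// Like `buffered(n)`, but the limit is the IO queue depth, not a constant.
pub struct AdaptiveBuffered<I> where I: Iterator, I::Item: Future {
    source: I,
    exhausted: bool, // set once `source` returns None
    in_flight: VecDeque<Entry<I::Item>>,
    io: IoHandle,
    target: usize, // desired IO queue depth
}

impl<I> AdaptiveBuffered<I> where I: Iterator, I::Item: Future {
    pub fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<<I::Item as Future>::Output>> {
        // 1. Advance the futures that are already in flight.
        for entry in self.in_flight.iter_mut() {
            if let Entry::Running(fut) = entry {
                let polled = fut.as_mut().poll(cx);
                if let Poll::Ready(out) = polled {
                    *entry = Entry::Done(out);
                }
            }
        }
        // 2. Pull and poll new futures until the IO queue is deep enough.
        while !self.exhausted && self.io.queued() < self.target {
            match self.source.next() {
                None => self.exhausted = true,
                Some(fut) => {
                    let mut fut = Box::pin(fut);
                    // The first poll is what lets the future enqueue its IO.
                    let polled = fut.as_mut().poll(cx);
                    self.in_flight.push_back(match polled {
                        Poll::Ready(out) => Entry::Done(out),
                        Poll::Pending => Entry::Running(fut),
                    });
                }
            }
        }
        // 3. Yield outputs in submission order.
        if matches!(self.in_flight.front(), Some(Entry::Done(_))) {
            if let Some(Entry::Done(out)) = self.in_flight.pop_front() {
                return Poll::Ready(Some(out));
            }
        }
        if self.exhausted && self.in_flight.is_empty() { Poll::Ready(None) } else { Poll::Pending }
    }
}

/// Fake read for the demo: enqueues one IO request on its first poll and
/// completes (dequeuing it) on the next. It never registers a waker.
struct FakeRead { io: IoHandle, id: usize, started: bool }

impl Future for FakeRead {
    type Output = usize;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<usize> {
        let this = self.get_mut();
        if this.started {
            this.io.dequeue();
            Poll::Ready(this.id)
        } else {
            this.started = true;
            this.io.enqueue();
            Poll::Pending
        }
    }
}

struct NoopWake;
impl Wake for NoopWake { fn wake(self: Arc<Self>) {} }

/// Drives the sketch by re-polling in a loop with a no-op waker.
pub fn demo() -> Vec<usize> {
    let io = IoHandle::default();
    let handle = io.clone();
    let reads = (0..5).map(move |id| FakeRead { io: handle.clone(), id, started: false });
    let mut stream = AdaptiveBuffered { source: reads, exhausted: false, in_flight: VecDeque::new(), io, target: 2 };
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut out = Vec::new();
    loop {
        match stream.poll_next(&mut cx) {
            Poll::Ready(Some(id)) => out.push(id),
            Poll::Ready(None) => return out,
            Poll::Pending => {}
        }
    }
}
```

With `target: 2`, the first poll starts only two reads, and further reads start only as the queue drains. A real version would also need the dispatcher to wake the task when the queue drains, which this sketch sidesteps by busy-polling.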

