tustvold commented on issue #5882: URL: https://github.com/apache/arrow-rs/issues/5882#issuecomment-2290891044
> To me it's an open question if every single object_store user should work around this tokio quirk or if we should have a generic solution within object_store

I believe the major sticking point in the past has been how to support streaming get requests and how that interacts with backpressure, but that definitely remains a possibility - https://github.com/apache/arrow-rs/pull/4040#issuecomment-1724082933.

However, with regard to DataFusion, object_store is only one example of IO that it might be performing. I doubt IOx is the only codebase that is making use of the async catalog support to perform IO. I therefore think a more holistic approach to separating IO and compute may be warranted, which would in turn obviate the need for such a solution at the object_store level. I definitely think it is a massive footgun that DataFusion's async APIs lure people into performing IO on the same runtime as CPU-bound tasks.