tustvold commented on PR #4040: URL: https://github.com/apache/arrow-rs/pull/4040#issuecomment-1724082933
Looping back to this, I think this problem is ill-formed. There are two major use-cases for this functionality:

1. Supporting object_store on threads not managed by tokio
2. Supporting object_store in systems containing multiple thread pools with different tail latencies

The first use-case is better served by integrating tokio at a higher level, e.g. using `Handle::enter` at the thread level (see the first sketch below).

It is unclear how to handle the second use-case at a library level. The use of a second threadpool implies that the primary threadpool may have very high tail latencies. The problem is determining at what point this should result in backpressure on the underlying TCP connection. As written, this PR will not change the way that this backpressure occurs: should the task not get scheduled on the high-tail-latency threadpool, nothing will drain the TCP socket, and TCP backpressure will occur. The approach in https://github.com/apache/arrow-rs/pull/4015 instead uses a queue with capacity for a single chunk, which will delay this TCP backpressure very slightly. You could increase the queue size, or build a more sophisticated queue that buffers a given number of bytes, but it is unclear how this should be controlled.

Taking a step back, this feels like the wrong way to solve this problem. Ultimately, IO should be segregated from compute at a meaningful application task boundary, rather than at the object_store interface. For example, `AsyncFileReader::get_bytes` could perform the IO to fetch a given chunk of data on a separate thread pool (see the second sketch below). This avoids object_store having to make decisions about how much buffering is too much, etc.

I am therefore going to close this PR.
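For the first use-case, here is a minimal sketch of what "integrating tokio at a higher level" might look like: an application-owned runtime whose `Handle` is cloned onto a plain OS thread, where `Handle::enter` makes tokio-aware APIs available and `Handle::block_on` drives the request. The store and path are placeholders, and object_store API details vary between versions:

```rust
use std::sync::Arc;
use object_store::{memory::InMemory, path::Path, ObjectStore};

fn main() {
    // Runtime owned by the application rather than by object_store
    let runtime = tokio::runtime::Runtime::new().unwrap();
    let store: Arc<dyn ObjectStore> = Arc::new(InMemory::new());

    let handle = runtime.handle().clone();
    let thread = std::thread::spawn(move || {
        // Enter the runtime's context on this non-tokio thread so
        // tokio-aware code (timers, spawn) works here, then block on
        // the request; the IO itself is polled by the tokio runtime
        let _guard = handle.enter();
        let result = handle.block_on(async {
            store.get(&Path::from("data/file.parquet")).await?.bytes().await
        });
        println!("fetched: {:?}", result.map(|b| b.len()));
    });
    thread.join().unwrap();
}
```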
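And for the second point, a sketch of segregating IO from compute at the application task boundary, in the spirit of an `AsyncFileReader::get_bytes` implementation delegating to a dedicated IO runtime. `IoBoundReader` and its fields are hypothetical names for illustration, not part of the object_store API, and the `Range<usize>` signature of `get_range` differs in some object_store versions:

```rust
use std::ops::Range;
use std::sync::Arc;
use bytes::Bytes;
use object_store::{path::Path, ObjectStore};

/// Hypothetical reader that spawns each fetch onto a runtime dedicated
/// to IO; the compute side only awaits the completed result
struct IoBoundReader {
    io: tokio::runtime::Handle, // handle to the dedicated IO runtime
    store: Arc<dyn ObjectStore>,
    path: Path,
}

impl IoBoundReader {
    async fn get_bytes(&self, range: Range<usize>) -> object_store::Result<Bytes> {
        let store = Arc::clone(&self.store);
        let path = self.path.clone();
        // The request runs entirely on the IO runtime, so the TCP socket
        // is drained there regardless of compute-side scheduling delays
        self.io
            .spawn(async move { store.get_range(&path, range).await })
            .await
            .expect("IO task panicked")
    }
}
```

With this shape, questions of how much to buffer live in the application's reader, where they can be tuned per workload, rather than inside object_store.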
