alamb commented on PR #542:
URL: 
https://github.com/apache/arrow-rs-object-store/pull/542#issuecomment-3517917553

   > And yes I stated above that this benchmark does not illustrate the 
difference as extremely as what I saw in production. Perhaps it's because in a 
real workload, we're doing a lot more on the runtime than just GET operations, 
and that accentuates the impact of additional polls.
   
   It might make sense to look into using a separate threadpool for CPU and IO 
work. 
   
   For example, you can move all your object store work to a different 
threadpool (tokio runtime) using the 
[SpawnedReqwestConnector](https://docs.rs/object_store/latest/object_store/client/struct.SpawnedReqwestConnector.html).
There is an end-to-end example in datafusion: 
https://github.com/apache/datafusion/blob/main/datafusion-examples/examples/thread_pools.rs
   
   Something we spent quite a long time investigating at InfluxData was that 
IO/network latencies increased substantially under highly concurrent workloads. 
We eventually tracked this down to using the same threadpool (tokio pool) for 
CPU and IO work. Doing so basically starves the IO of the CPU it needs to make 
progress in the TCP state machine, and it seems that the TCP stack then treats 
the system as being congested and slows down traffic. 
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
