reidkaufmann commented on issue #7117:
URL: https://github.com/apache/arrow-rs/issues/7117#issuecomment-2651921814

   FYI: I noticed (when trying to determine the best way to filter a Wireshark 
trace for S3 traffic) that the pool of addresses Amazon provides (even over a 
short period of time, i.e., a minute) is much bigger than the list in a single 
DNS response.
   
   > this trades cost (via number of requests)
   
   Are you suggesting racing every S3 request, or just the ones that take "too 
long" (the S3 doc linked seems to be recommending aggressive timeouts)?  I'm just 
curious because racing all reads has a bandwidth cost.  If we figure out how to 
saturate our network with S3 traffic ([Amazon docs say it's 
possible](https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html#maximize)),
 there could be a significant impact (reflected in latency) to overall system 
performance in cache warming/thrashing scenarios, so we might want to use the 
technique judiciously if we start approaching total network bandwidth.
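
To make the distinction concrete, here is a minimal sketch of the "only race the slow ones" variant. It is illustrative only and not arrow-rs or `object_store` code: `hedged`, `fake_s3`, and the `hedge_after` threshold are hypothetical names, and the S3 call is simulated with `asyncio.sleep`. A second request is issued only if the first has not finished within `hedge_after` seconds, so the fast path costs one request and the extra bandwidth is spent only on stragglers.

```python
import asyncio


async def hedged(make_request, hedge_after):
    """Return (result, requests_issued).

    `make_request` is a hypothetical zero-argument coroutine function that
    performs one read. A backup is started only if the primary is still
    running after `hedge_after` seconds.
    """
    primary = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({primary}, timeout=hedge_after)
    if done:
        # Fast path: a single request, no extra cost.
        return primary.result(), 1

    # Slow path: race a second request against the still-running first one.
    backup = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # stop paying for the loser
    return done.pop().result(), 2


def fake_s3(latencies):
    """Simulated store: each call sleeps for the next latency and returns it."""
    remaining = iter(latencies)

    async def request():
        delay = next(remaining)
        await asyncio.sleep(delay)
        return delay

    return request


if __name__ == "__main__":
    # The first request is fast, so no backup is sent.
    print(asyncio.run(hedged(fake_s3([0.01]), hedge_after=0.2)))
    # The first request stalls, so a backup is sent and wins the race.
    print(asyncio.run(hedged(fake_s3([1.0, 0.01]), hedge_after=0.05)))
```

Racing every read corresponds to `hedge_after=0`, which issues two requests each time. Raising the threshold trades tail latency for bandwidth, which is the knob that would matter when the link is close to saturated.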
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
