mrocklin commented on issue #38389:
URL: https://github.com/apache/arrow/issues/38389#issuecomment-1779469247

   For S3, we've found that a thread count of 2-3x numcpus is pretty good.  One 
gets about 50 MB/s per S3 connection, and total aggregate S3 bandwidth on Amazon 
is correlated with machine size (larger machines with more cores have more 
bandwidth).  This scales linearly for modestly sized machines, so 2-3x ends up 
being a good general rule.
   
   This is made more explicit at the top of the notebook I shared (using more 
threads with raw S3 access results in greater aggregate bandwidth).
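
   As a rough sketch of that rule of thumb (not the code from the notebook), 
the sizing could look like the snippet below.  The helper names 
`s3_thread_count`, `estimated_bandwidth_mb_s`, and `fetch_all` are hypothetical, 
the 50 MB/s constant is only the approximate per-connection figure quoted above, 
and the estimate assumes one S3 connection per thread and linear scaling, which 
only holds for modestly sized machines.

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Approximate per-connection S3 throughput quoted above (an estimate, not a guarantee).
PER_CONNECTION_MB_S = 50


def s3_thread_count(num_cpus=None, multiplier=3):
    """Hypothetical helper: size the I/O thread pool at `multiplier` x CPU count."""
    if num_cpus is None:
        num_cpus = os.cpu_count() or 1
    return max(1, int(num_cpus * multiplier))


def estimated_bandwidth_mb_s(threads, per_connection=PER_CONNECTION_MB_S):
    """Rough upper bound, assuming one connection per thread and linear scaling."""
    return threads * per_connection


def fetch_all(keys, fetch_one, num_cpus=None, multiplier=3):
    """Run `fetch_one(key)` for every key on a pool sized by the rule above.

    `fetch_one` is a caller-supplied function (for example, a wrapper around
    an S3 GET); it is an assumption of this sketch, not a library API.
    """
    workers = s3_thread_count(num_cpus, multiplier)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves the order of `keys` in the results.
        return list(pool.map(fetch_one, keys))
```

   On a 4-core machine with the default multiplier this gives 12 threads and an 
estimated ceiling of about 600 MB/s, subject to the instance's actual network 
limit.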

