westonpace commented on PR #35440:
URL: https://github.com/apache/arrow/pull/35440#issuecomment-1535591025

   I'm leaving this in draft while I do more profiling.  I have already tested 
the worst-case scenario (10k files spread across 10k directories), and this 
change improves performance by 10-15x when testing from my desktop to S3.  I've 
also tested the flat scenario (10k files in the bucket with no directories), and 
there is no regression.
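
   As a rough illustration of why the nested layout is the worst case, here is 
a minimal sketch that only counts list requests.  It assumes a walker that 
issues one delimiter-based listing per directory and compares it with a single 
delimiter-free listing of the whole bucket.  The function names, the key names, 
and the request model are all hypothetical and are not taken from the PR; the 
only S3 fact used is that one list response returns at most 1000 entries.

```python
import math

PAGE_SIZE = 1000  # S3 returns at most 1000 entries per list response


def nested_keys(n):
    """Worst case: n files, each in its own top-level directory (hypothetical names)."""
    return [f"dir{i:05d}/file{i:05d}.parquet" for i in range(n)]


def flat_keys(n):
    """Flat case: n files directly in the bucket, no directories."""
    return [f"file{i:05d}.parquet" for i in range(n)]


def pages(entries):
    """Number of list responses needed for this many entries (at least one)."""
    return max(1, math.ceil(entries / PAGE_SIZE))


def requests_per_directory_walk(keys):
    """Assumed model: list the root with a delimiter, then list each directory.

    Only one level of nesting is modelled, which is enough for the two
    scenarios described above.
    """
    files_per_dir = {}
    root_files = 0
    for key in keys:
        if "/" in key:
            top = key.split("/", 1)[0]
            files_per_dir[top] = files_per_dir.get(top, 0) + 1
        else:
            root_files += 1
    # Root listing returns both root files and one common prefix per directory.
    total = pages(root_files + len(files_per_dir))
    for count in files_per_dir.values():
        total += pages(count)
    return total


def requests_single_recursive_listing(keys):
    """Assumed model: one delimiter-free listing that pages through every key."""
    return pages(len(keys))


if __name__ == "__main__":
    for name, keys in (("nested", nested_keys(10_000)), ("flat", flat_keys(10_000))):
        print(name,
              requests_per_directory_walk(keys),
              requests_single_recursive_listing(keys))
```

   This counts requests only and does not predict wall-clock time: requests 
can be issued concurrently, and per-request latency depends on where the client 
runs, so the measured speedup will not match the ratio of request counts.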
   
   I also want to test running from within EC2.  I expect the performance gains 
to be smaller there, since the request latency is lower, but there should still 
be some gain.
   
   Finally, I want to run some local perf tests with MinIO.

