steveloughran commented on pull request #34762:
URL: https://github.com/apache/spark/pull/34762#issuecomment-1005776551


   Only just seen this.
   How much throttling are you actually seeing? And, assuming the s3a client, 
have you set directory marker retention to keep? It is often the attempted 
deletion of directory markers which triggers the problem. Effectively it is 
a form of write amplification.
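
   For reference, a minimal sketch of that setting as it might appear in 
spark-defaults.conf. It assumes a Hadoop release with marker retention support 
(3.3.1 or later) and that every client using the bucket is marker-aware; check 
both before changing it.

```
spark.hadoop.fs.s3a.directory.marker.retention  keep
```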
   
   I'm actually thinking that a feature to go into s3a this year should be 
configurable rate limiting through the guava RateLimiter; I'm using this in the 
abfs committer to keep committer IO below the limits where throttling starts to 
cause problems with renames. This is all per process; things like random 
filenames are still going to be critical to spread load on s3.
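
   To illustrate the per-process idea, here is a rough sketch of a limiter 
that spaces out permits at a fixed rate. It is a JDK-only stand-in, not Guava's 
RateLimiter and not the abfs committer code; the class name SimpleRateLimiter 
and its methods are hypothetical. A caller would invoke acquire() before each 
delete or rename request.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical per-process limiter: grants permits at a fixed spacing. */
final class SimpleRateLimiter {
    private final long intervalNanos;  // minimum spacing between permits
    private long nextFreeNanos;        // earliest time the next permit may be granted

    SimpleRateLimiter(double permitsPerSecond) {
        if (permitsPerSecond <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        this.intervalNanos = (long) (TimeUnit.SECONDS.toNanos(1) / permitsPerSecond);
        this.nextFreeNanos = Long.MIN_VALUE;  // first permit is granted immediately
    }

    /** Reserve one permit at time nowNanos; returns how long the caller must wait. */
    synchronized long reserve(long nowNanos) {
        long grantedAt = Math.max(nowNanos, nextFreeNanos);
        nextFreeNanos = grantedAt + intervalNanos;
        return grantedAt - nowNanos;
    }

    /** Block until a permit is available. */
    void acquire() throws InterruptedException {
        long waitNanos = reserve(System.nanoTime());
        if (waitNanos > 0) {
            TimeUnit.NANOSECONDS.sleep(waitNanos);
        }
    }
}
```

   Each JVM holds its own limiter state, so this caps only what one process 
sends; it does nothing to coordinate the aggregate load from many executors.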


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


