[GitHub] [hadoop] steveloughran opened a new pull request #1826: HADOOP-16823. Manage S3 Throttling exclusively in S3A client.

GitBox Fri, 31 Jan 2020 03:48:43 -0800

steveloughran opened a new pull request #1826: HADOOP-16823. Manage S3 
Throttling exclusively in S3A client. 
URL: https://github.com/apache/hadoop/pull/1826
 
 
   Currently AWS S3 throttling is initially handled in the AWS SDK, only 
reaching the S3 client code after it has given up.
   
   This means we don't always directly observe when throttling is taking place.
   
   * disable throttling retries in the AWS client library *for S3 only*
   * add a quantile for the S3 throttle events, as DDB has
   * isolate counters of S3 and DDB throttle events to classify issues better
   * improve DDB throttling handling and testing
   
   1. Because we are taking over the AWS retries, we need to expand the initial 
delay on retries and the number of retries we should support before giving up.
   1. I can split the DDB and S3 sides of this patch. They came in together 
because, once I turned off throttling retries across all AWS client configs, 
scale tests against a 10 TPS DDB table showed we weren't retrying adequately in 
some of the tests, and were retrying inefficiently in listChildren.
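
To illustrate the idea of handling throttling in the client rather than in the SDK, here is a minimal sketch of a retry loop that counts every throttle event and backs off exponentially. It is not the code in this patch: `ThrottledException`, `ThrottleRetrySketch`, and the `maxRetries` and `initialDelayMillis` parameters are all hypothetical names chosen for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for a throttling failure, e.g. a 503 "Slow Down" from S3. */
class ThrottledException extends Exception {
    ThrottledException(String message) {
        super(message);
    }
}

/** Sketch only: count each throttle event in the client, then back off and retry. */
class ThrottleRetrySketch {
    private final int maxRetries;            // assumed knob: retries before giving up
    private final long initialDelayMillis;   // assumed knob: delay before the first retry
    private final AtomicLong throttleEvents = new AtomicLong();

    ThrottleRetrySketch(int maxRetries, long initialDelayMillis) {
        this.maxRetries = maxRetries;
        this.initialDelayMillis = initialDelayMillis;
    }

    /** Exponential backoff: the delay doubles on each attempt (shift capped at 20). */
    long delayForAttempt(int attempt) {
        return initialDelayMillis << Math.min(attempt, 20);
    }

    /** Run the operation, retrying only when it reports throttling. */
    <T> T invoke(Callable<T> operation) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return operation.call();
            } catch (ThrottledException e) {
                // Every throttle event is visible here, not hidden inside an SDK.
                throttleEvents.incrementAndGet();
                if (attempt >= maxRetries) {
                    throw e;
                }
                Thread.sleep(delayForAttempt(attempt));
            }
        }
    }

    long getThrottleEvents() {
        return throttleEvents.get();
    }
}
```

Because the loop owns the retries, the counter sees every throttle event, including those that a later retry recovers from. That is also why the initial delay and the retry limit need to be larger than when an SDK has already retried underneath.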
   
   
   (reinstatement of #1814, which was accidentally closed)


