[ 
https://issues.apache.org/jira/browse/HADOOP-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron Fabbri updated HADOOP-13904:
----------------------------------
    Attachment: HADOOP-13904-HADOOP-13345.002.patch

Attaching v2 patch:

Add exponential backoff timer to batched DynamoDB operations.
Add scale tests for both MetadataStore implementations.
Also, fix copy/paste typo in parallel test exclusions.

Some output from testing is pasted below. You can compare it with the output 
from before I added the backoff timer:

{noformat}
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 128 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 341 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 487 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 543 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 559 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 609 msec

DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 138 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 204 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 331 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 418 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 1124 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 1185 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 1158 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 3 took 1206 msec

DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 250 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 311 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 469 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 528 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 630 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 1829 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 886 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 3 took 931 msec
{noformat}
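
For readers without the patch at hand: the "Sleeping N msec" values above vary 
between runs and tend to grow with the retry number, which is what a randomized 
exponential backoff would produce. Below is a minimal sketch of that idea, not 
the code from the patch. The class name, the constants, and the use of a 
BooleanSupplier to stand in for "one batch attempt; true when nothing is left 
unprocessed" are all assumptions made for illustration.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch; names and constants are NOT taken from the patch. */
final class BackoffSketch {
  // Assumed values, chosen only for illustration.
  static final long BASE_DELAY_MSEC = 100;
  static final long MAX_DELAY_MSEC = 5_000;

  /** Upper bound on the sleep before retry N (0-based): base * 2^N, capped. */
  static long maxDelayMsec(int retryCount) {
    // Clamp the shift so the left shift cannot overflow a long.
    int shift = Math.min(retryCount, 20);
    return Math.min(MAX_DELAY_MSEC, BASE_DELAY_MSEC << shift);
  }

  /** Random delay in [1, maxDelayMsec(retryCount)], so callers do not retry in lockstep. */
  static long jitteredDelayMsec(int retryCount, Random random) {
    long bound = maxDelayMsec(retryCount);
    return 1 + (long) (random.nextDouble() * bound);
  }

  /**
   * Repeats the attempt until it reports success, sleeping between tries.
   * Returns the number of retries used; gives up after maxRetries retries.
   * A real filesystem client would more likely throw an IOException here;
   * unchecked exceptions just keep the sketch short.
   */
  static int runWithBackoff(BooleanSupplier attempt, int maxRetries, Random random) {
    int retryCount = 0;
    while (!attempt.getAsBoolean()) {
      if (retryCount >= maxRetries) {
        throw new IllegalStateException("Giving up after " + retryCount + " retries");
      }
      try {
        Thread.sleep(jitteredDelayMsec(retryCount, random));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();  // preserve the interrupt for callers
        throw new IllegalStateException("Interrupted while backing off", e);
      }
      retryCount++;
    }
    return retryCount;
  }
}
```

With these assumed constants the upper bound on the sleep doubles per retry 
(100, 200, 400, ... msec) until it reaches the cap, and the random draw below 
that bound spreads out clients that were throttled at the same moment.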

Tested via included scale test in US West 2.

Also note that this test actually *can* be run in parallel with others, since 
it uses its own (fake) bucket name. This patch is based on top of HADOOP-13876, 
so there may be Jenkins failures without it.

> DynamoDBMetadataStore to handle DDB throttling failures through retry policy
> ----------------------------------------------------------------------------
>
>                 Key: HADOOP-13904
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13904
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13904-HADOOP-13345.001.patch, 
> HADOOP-13904-HADOOP-13345.002.patch
>
>
> When you overload DDB, you get error messages warning of throttling, [as 
> documented by 
> AWS|http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html#Programming.Errors.MessagesAndCodes].
> Reduce load on DDB by doing a table lookup before the create. Then, in table 
> create/delete operations and in get/put actions, recognise the error codes 
> and retry using an appropriate retry policy (exponential backoff + ultimate 
> failure).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
