[
https://issues.apache.org/jira/browse/HADOOP-13904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aaron Fabbri updated HADOOP-13904:
----------------------------------
Attachment: HADOOP-13904-HADOOP-13345.002.patch
Attaching v2 patch:
- Add exponential backoff timer to batched DynamoDB operations.
- Add scale tests for both MetadataStore implementations.
- Also, fix copy/paste typo in parallel test exclusions.
Some output from testing is pasted below. You can compare it with the output
from before I added the backoff timer:
{noformat}
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 128 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 341 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 487 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 543 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 559 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 609 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 138 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 204 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 331 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 418 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 1124 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 1185 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 1158 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 3 took 1206 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 250 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 0 took 311 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 469 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 1 took 528 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 630 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 2 took 1829 msec
DynamoDBMetadataStore.java:retryBackoff(511)) - Sleeping 886 msec before next retry
DynamoDBMetadataStore.java:processBatchWriteRequest(481)) - Retry 3 took 931 msec
{noformat}
Tested with the included scale test in US West 2.
Also note that this test actually *can* be run in parallel with others, since it
uses its own (fake) bucket name. This patch is based on top of HADOOP-13876,
so there may be Jenkins failures without it.
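
For readers who have not opened the patch, here is a minimal sketch of the
general technique (exponential backoff with random jitter, giving up after a
bounded number of retries). It is not the code in the attachment: the class
name BackoffSketch, the ThrottledException type, the method names, and the
base/cap values are all hypothetical, chosen only for illustration.

```java
import java.util.Random;
import java.util.concurrent.Callable;

/** Illustrative sketch only; not taken from the attached patch. */
class BackoffSketch {

  /** Hypothetical stand-in for a throttling error reported by the store. */
  static class ThrottledException extends Exception {
    ThrottledException(String msg) {
      super(msg);
    }
  }

  /** Upper bound on the sleep before retry {@code retry} (0-based): base * 2^retry, capped. */
  static long maxDelayMs(int retry, long baseMs, long capMs) {
    long delay = Math.min(baseMs, capMs);
    for (int i = 0; i < retry && delay < capMs; i++) {
      // Double the bound, clamping to the cap without overflowing.
      delay = (delay > capMs / 2) ? capMs : delay * 2;
    }
    return delay;
  }

  /** Random delay in [max/2, max], so concurrent clients do not retry in lockstep. */
  static long jitteredDelayMs(int retry, long baseMs, long capMs, Random random) {
    long max = maxDelayMs(retry, baseMs, capMs);
    long half = max / 2;
    return half + (long) (random.nextDouble() * (max - half + 1));
  }

  /** Runs {@code op}, sleeping between throttled attempts; rethrows after maxRetries retries. */
  static <T> T retry(Callable<T> op, int maxRetries, long baseMs, long capMs, Random random)
      throws Exception {
    for (int retry = 0; ; retry++) {
      try {
        return op.call();
      } catch (ThrottledException e) {
        if (retry >= maxRetries) {
          throw e; // ultimate failure: retry budget exhausted
        }
        Thread.sleep(jitteredDelayMs(retry, baseMs, capMs, random));
      }
    }
  }

  public static void main(String[] args) {
    // Print the growth of the delay bound for assumed values base=100 ms, cap=5000 ms.
    for (int retry = 0; retry < 8; retry++) {
      System.out.println("retry " + retry + ": sleep at most "
          + maxDelayMs(retry, 100, 5000) + " msec");
    }
  }
}
```

The jitter is why a later retry can occasionally sleep for less time than the
upper bound suggests; only the bound itself grows exponentially. Any other
exception thrown by the operation propagates immediately without a retry.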
> DynamoDBMetadataStore to handle DDB throttling failures through retry policy
> ----------------------------------------------------------------------------
>
> Key: HADOOP-13904
> URL: https://issues.apache.org/jira/browse/HADOOP-13904
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: HADOOP-13345
> Reporter: Steve Loughran
> Assignee: Aaron Fabbri
> Attachments: HADOOP-13904-HADOOP-13345.001.patch,
> HADOOP-13904-HADOOP-13345.002.patch
>
>
> When you overload DDB, you get error messages warning of throttling, [as
> documented by
> AWS|http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html#Programming.Errors.MessagesAndCodes].
> Reduce load on DDB by doing a table lookup before the create. Then, in table
> create/delete operations and in get/put actions, recognise the error codes
> and retry using an appropriate retry policy (exponential backoff + ultimate
> failure).