[ https://issues.apache.org/jira/browse/HADOOP-14810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145104#comment-16145104 ]

Steve Loughran commented on HADOOP-14810:
-----------------------------------------

Managed to break the tests when working against a bucket whose DDB table was pre-created with {{hadoop s3guard init -write 20 -read 20}}, and with five test cases running in parallel.
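
For reference, the setup was roughly as follows (the bucket is my test bucket; {{-Dparallel-tests}} / {{-DtestsThreadCount}} are the standard hadoop-aws parallel-test options, and the thread count of 5 is my reconstruction of how the five parallel cases were run):

{code}
# pre-provision the S3Guard table at 20 read / 20 write capacity units
hadoop s3guard init -write 20 -read 20 s3a://hwdev-steve-ireland-new/

# run the hadoop-aws integration tests with five parallel test threads
mvn verify -Dparallel-tests -DtestsThreadCount=5
{code}

The resulting failures: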

{code}
testRollingRenames(org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency)  Time elapsed: 22.544 sec  <<< ERROR!
org.apache.hadoop.fs.s3a.AWSServiceIOException: listChildren on s3a://hwdev-steve-ireland-new/test/rolling/2: com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ProvisionedThroughputExceededException; Request ID: TRCOPRJ8KVD4F22178GTOP5AIBVV4KQNSO5AEMVJF66Q9ASUAAJG): The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ProvisionedThroughputExceededException; Request ID: TRCOPRJ8KVD4F22178GTOP5AIBVV4KQNSO5AEMVJF66Q9ASUAAJG)
{code}
{code}
Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 230.809 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB
testPruneCommandCLI(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)  Time elapsed: 174.084 sec  <<< ERROR!
com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException: The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API. (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ProvisionedThroughputExceededException; Request ID: 9NTSEF5S8M3EI7MUN0EV2ERKE3VV4KQNSO5AEMVJF66Q9ASUAAJG)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1588)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2089)
        at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2065)
        at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeBatchWriteItem(AmazonDynamoDBClient.java:575)
        at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.batchWriteItem(AmazonDynamoDBClient.java:551)
        at com.amazonaws.services.dynamodbv2.document.internal.BatchWriteItemImpl.doBatchWriteItem(BatchWriteItemImpl.java:111)
        at com.amazonaws.services.dynamodbv2.document.internal.BatchWriteItemImpl.batchWriteItemUnprocessed(BatchWriteItemImpl.java:64)
        at com.amazonaws.services.dynamodbv2.document.DynamoDB.batchWriteItemUnprocessed(DynamoDB.java:189)
        at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.processBatchWriteRequest(DynamoDBMetadataStore.java:580)
        at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.prune(DynamoDBMetadataStore.java:761)
        at org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Prune.run(S3GuardTool.java:938)
        at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.exec(AbstractS3GuardToolTestBase.java:277)
        at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.exec(AbstractS3GuardToolTestBase.java:255)
        at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.testPruneCommand(AbstractS3GuardToolTestBase.java:194)
        at org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.testPruneCommandCLI(AbstractS3GuardToolTestBase.java:206)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
        at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

Tests run: 62, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 388.583 sec - in org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations

Results :
{code}
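
What's needed is something along the lines of the sketch below: catch the throttling exception and retry with exponential backoff, bumping a metrics counter on every throttle event. This is a minimal illustration only; the class and interface names, attempt limit and base delay are all made up here, not the real DynamoDBMetadataStore code.

{code}
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException;

public final class ThrottleRetry {

  private static final int MAX_ATTEMPTS = 7;
  private static final long BASE_DELAY_MS = 100;

  /** The DDB call being retried; a SAM interface so lambdas can be passed in. */
  public interface DDBOperation {
    void execute();
  }

  /**
   * Retry {@code op} with exponential backoff whenever DynamoDB reports
   * that the table's provisioned throughput was exceeded.
   */
  public static void retryOnThrottle(DDBOperation op)
      throws InterruptedException {
    for (int attempt = 1; ; attempt++) {
      try {
        op.execute();
        return;
      } catch (ProvisionedThroughputExceededException e) {
        if (attempt >= MAX_ATTEMPTS) {
          throw e;   // out of retries: surface the original failure
        }
        // exponential backoff: 100ms, 200ms, 400ms, ...
        Thread.sleep(BASE_DELAY_MS << (attempt - 1));
        // a throttle-event counter would be incremented here, giving the
        // metrics the issue asks for
      }
    }
  }
}
{code}

Invocation would then look like {{retryOnThrottle(() -> ddb.batchWriteItemUnprocessed(unprocessed))}}, and the same wrapper is the natural point for fault injection to substitute a client that throws the exception on demand.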

> S3Guard: handle provisioning failure through backoff & retry (& metrics)
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-14810
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14810
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>
> S3Guard can't handle overloaded tables.
> I think we all thought the API did: it doesn't; exceptions get raised and the
> caller is expected to handle them.
> This relates very much to the s3a-lambda invocation code in HADOOP-13786 for
> handling failures during commit, and to the need for all the S3AFileSystem
> calls of the S3 APIs to handle transient failures like throttling. It also
> needs some fault injection to verify the handling, plus metrics to count the
> throttle rate, so throttling can be monitored and used to understand why work
> is underperforming.


