[ 
https://issues.apache.org/jira/browse/HADOOP-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16150539#comment-16150539
 ] 

Steve Loughran commented on HADOOP-13761:
-----------------------------------------

Managed to break tests when working with a bucket whose DDB table was 
precreated at {{ hadoop s3guard init -write 20 -read 20}} & five parallel test 
cases

{code}

Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 230.809 sec <<< 
FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB
testPruneCommandCLI(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)  
Time elapsed: 174.084 sec  <<< ERROR!
com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException: 
The level of configured provisioned throughput for the table was exceeded. 
Consider increasing your provisioning level with the UpdateTable API. (Service: 
AmazonDynamoDBv2; Status Code: 400; Error Code: 
ProvisionedThroughputExceededException; Request ID: 
9NTSEF5S8M3EI7MUN0EV2ERKE3VV4KQNSO5AEMVJF66Q9ASUAAJG)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1588)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1258)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1030)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:742)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:716)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2089)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2065)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeBatchWriteItem(AmazonDynamoDBClient.java:575)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.batchWriteItem(AmazonDynamoDBClient.java:551)
        at 
com.amazonaws.services.dynamodbv2.document.internal.BatchWriteItemImpl.doBatchWriteItem(BatchWriteItemImpl.java:111)
        at 
com.amazonaws.services.dynamodbv2.document.internal.BatchWriteItemImpl.batchWriteItemUnprocessed(BatchWriteItemImpl.java:64)
        at 
com.amazonaws.services.dynamodbv2.document.DynamoDB.batchWriteItemUnprocessed(DynamoDB.java:189)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.processBatchWriteRequest(DynamoDBMetadataStore.java:580)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.prune(DynamoDBMetadataStore.java:761)
        at 
org.apache.hadoop.fs.s3a.s3guard.S3GuardTool$Prune.run(S3GuardTool.java:938)
        at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.exec(AbstractS3GuardToolTestBase.java:277)
        at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.exec(AbstractS3GuardToolTestBase.java:255)
        at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.testPruneCommand(AbstractS3GuardToolTestBase.java:194)
        at 
org.apache.hadoop.fs.s3a.s3guard.AbstractS3GuardToolTestBase.testPruneCommandCLI(AbstractS3GuardToolTestBase.java:206)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
        at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)

Tests run: 62, Failures: 0, Errors: 0, Skipped: 3, Time elapsed: 388.583 sec - 
in org.apache.hadoop.fs.s3a.fileContext.ITestS3AFileContextMainOperations

Results :

{code}

> S3Guard: implement retries 
> ---------------------------
>
>                 Key: HADOOP-13761
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13761
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add 
> retry logic.
> In HADOOP-13651, I added TODO comments in most of the places retry loops are 
> needed, including:
> - open(path).  If MetadataStore reflects recent create/move of file path, but 
> we fail to read it from S3, retry.
> - delete(path).  If deleteObject() on S3 fails, but MetadataStore shows the 
> file exists, retry.
> - rename(src,dest).  If source path is not visible in S3 yet, retry.
> - listFiles(). Skip for now. Not currently implemented in S3Guard. I will 
> create a separate JIRA for this as it will likely require interface changes 
> (i.e. prefix or subtree scan).
> We may miss some cases initially and we should do failure injection testing 
> to make sure we're covered.  Failure injection tests can be a separate JIRA 
> to make this easier to review.
> We also need basic configuration parameters around retry policy.  There 
> should be a way to specify maximum retry duration, as some applications would 
> prefer to receive an error eventually, than waiting indefinitely.  We should 
> also be keeping statistics when inconsistency is detected and we enter a 
> retry loop.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to