[ 
https://issues.apache.org/jira/browse/HADOOP-15583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541668#comment-16541668
 ] 

Steve Loughran commented on HADOOP-15583:
-----------------------------------------

New DDB stack trace seen in {{ITestS3GuardConcurrentOps}} while testing this;
connect timeout = 15000 ms. I assume it's unrelated and just a function of the
parallel load of a test run with
{{-Dparallel-tests -DtestsThreadCount=6 -Ds3guard -Ddynamodb}}
and network delays; a possible timeout tweak is sketched below the trace.

{code}
[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 52.5 s
<<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps
[ERROR] 
testConcurrentTableCreations(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps)
  Time elapsed: 52.384 s  <<< FAILURE!
java.lang.AssertionError: 1/16 threads threw exceptions while initializing on 
iteration 2
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps.testConcurrentTableCreations(ITestS3GuardConcurrentOps.java:172)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
Caused by: java.lang.IllegalArgumentException: Table 
testConcurrentTableCreations578159635 did not transition into ACTIVE state.
        at 
com.amazonaws.services.dynamodbv2.document.Table.waitForActive(Table.java:489)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.waitForTableActive(DynamoDBMetadataStore.java:1040)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initTable(DynamoDBMetadataStore.java:931)
        at 
org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initialize(DynamoDBMetadataStore.java:359)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps$2.call(ITestS3GuardConcurrentOps.java:145)
        at 
org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardConcurrentOps$2.call(ITestS3GuardConcurrentOps.java:136)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: 
Connect to dynamodb.eu-west-1.amazonaws.com:443 
[dynamodb.eu-west-1.amazonaws.com/52.94.25.90] failed: connect timed out
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1114)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1064)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2925)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2901)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:1515)
        at 
com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:1491)
        at 
com.amazonaws.services.dynamodbv2.waiters.DescribeTableFunction.apply(DescribeTableFunction.java:53)
        at 
com.amazonaws.services.dynamodbv2.waiters.DescribeTableFunction.apply(DescribeTableFunction.java:24)
        at 
com.amazonaws.waiters.WaiterExecution.getCurrentState(WaiterExecution.java:102)
        at 
com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:74)
        at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
        at 
com.amazonaws.services.dynamodbv2.document.Table.waitForActive(Table.java:482)
        ... 9 more
Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectTimeoutException: 
Connect to dynamodb.eu-west-1.amazonaws.com:443 
[dynamodb.eu-west-1.amazonaws.com/52.94.25.90] failed: connect timed out
        at 
com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:150)
        at 
com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
        at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
        at com.amazonaws.http.conn.$Proxy15.connect(Unknown Source)
        at 
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:380)
        at 
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
        at 
com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:184)
        at 
com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:184)
        at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82)
        at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:55)
        at 
com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1236)
        at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
        ... 25 more
Caused by: java.net.SocketTimeoutException: connect timed out
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at 
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at 
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at 
com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:337)
        at 
com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:132)
        at 
com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:141)
        ... 40 more
{code}
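
If this keeps showing up under parallel load, one option is to raise the
connect/socket timeouts in the test configuration. A minimal sketch, assuming
the DDB client factory picks up the standard s3a connection keys; the class
name and the values here are illustrative only:

{code}
// Sketch only: what relaxed timeouts could look like for a heavily parallel
// test run. Key names are the standard s3a ones; values are illustrative.
import org.apache.hadoop.conf.Configuration;

public class TimeoutTweak {
  public static Configuration relaxedTimeouts() {
    Configuration conf = new Configuration();
    conf.setInt("fs.s3a.connection.establish.timeout", 30000); // connect timeout, ms
    conf.setInt("fs.s3a.connection.timeout", 60000);           // socket timeout, ms
    conf.setInt("fs.s3a.attempts.maximum", 10);                // retry a little harder
    return conf;
  }
}
{code}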

> Stabilize S3A Assumed Role support
> ----------------------------------
>
>                 Key: HADOOP-15583
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15583
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.1.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>         Attachments: HADOOP-15583-001.patch, HADOOP-15583-002.patch
>
>
> This started off as just sharing credentials across S3A and S3Guard, but in 
> the process it has grown into stabilising the assumed role support so it can 
> be used for more than just testing.
> Was: "S3Guard to get AWS Credential chain from S3AFS; credentials closed() on 
> shutdown"
> h3. Issue: lack of auth chain sharing causes DDB and S3 to get out of sync
> S3Guard builds its DDB auth chain itself, which saves it from worrying about 
> whether it was created standalone or as part of an S3AFS, but it means its 
> authenticators live in a separate chain.
> When you are using short-lived assumed roles or other session credentials 
> updated in the S3A FS authentication chain, you need that same set of 
> credentials picked up by DDB. Otherwise, at best you are doubling the load; 
> at worst, the DDB connector may not get refreshed credentials.
> Proposed: {{DynamoDBClientFactory.createDynamoDBClient()}} to take an 
> optional ref to AWS credentials. If set, don't create a new set.
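> A rough sketch of what that could look like (the signature and names here 
> are illustrative only, not a final API):
> {code}
> import java.io.IOException;
> import com.amazonaws.auth.AWSCredentialsProvider;
> import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
> import org.apache.hadoop.conf.Configurable;
>
> public interface DynamoDBClientFactory extends Configurable {
>   /**
>    * If credentials is non-null, reuse the filesystem's chain; otherwise
>    * build a fresh chain from the configuration, as today.
>    */
>   AmazonDynamoDB createDynamoDBClient(String defaultRegion,
>       AWSCredentialsProvider credentials) throws IOException;
> }
> {code}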
> There's one little complication here: our {{AWSCredentialProviderList}} is 
> autocloseable; its close() will go through all children and close them. 
> Apparently the AWS S3 client (and hopefully the DDB client) will close this 
> when it is closed itself. If DDB has the same set of credentials as the FS, 
> then there could be trouble if they are closed in one place while the other 
> still wants to use them.
> Solution: add a use count to the credentials list, starting at one; every 
> close() call decrements it, and when it hits zero the real cleanup is 
> kicked off.
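> A minimal sketch of that reference-counting idea (illustrative only, not the 
> committed code):
> {code}
> import java.io.Closeable;
> import java.util.concurrent.atomic.AtomicInteger;
>
> // Share one credential chain between the S3 and DDB clients; only close the
> // underlying providers once the last user has called close().
> class RefCountedCredentials implements Closeable {
>   private final AtomicInteger refCount = new AtomicInteger(1); // creator holds one ref
>
>   /** Take another reference before handing the chain to a second client. */
>   RefCountedCredentials share() {
>     refCount.incrementAndGet();
>     return this;
>   }
>
>   @Override
>   public void close() {
>     if (refCount.decrementAndGet() == 0) {
>       // here: close the wrapped AWSCredentialProviderList's children
>     }
>   }
> }
> {code}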
> h3. Issue: the {{AssumedRoleCredentialProvider}} connection to STS is not 
> picking up the s3a connection settings, including the proxy settings.
> h3. Issue: we're not using getPassword() to read the user/password for the 
> proxy binding for STS. Fix: use it, and pass down the bucket ref so 
> per-bucket secrets in a JCEKS file are picked up.
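> Roughly, reading the proxy secret through getPassword() would look like this 
> (sketch only; the per-bucket fs.s3a.bucket.<bucket>.* override would be 
> layered on top of it):
> {code}
> import java.io.IOException;
> import org.apache.hadoop.conf.Configuration;
>
> final class ProxySecrets {
>   /** Resolve the proxy password: JCEKS credential providers first, then config. */
>   static String proxyPassword(Configuration conf) throws IOException {
>     char[] secret = conf.getPassword("fs.s3a.proxy.password");
>     return secret == null ? null : String.valueOf(secret);
>   }
> }
> {code}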
> h3. Issue: hard to debug what's going wrong :)


