[ 
https://issues.apache.org/jira/browse/HADOOP-18410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581975#comment-17581975
 ] 

Steve Loughran commented on HADOOP-18410:
-----------------------------------------


OK, this is bizarre.
The async stream draining only executes (in the other thread) if the original 
invoker waits for the result.

this is not caused by some synchronized conflict -I made the method being 
invoked static to ensure this.

And the logging of the start/finish of the call are present. just not the bit 
in the middle.

h3. test run with the {{join()}} to make it blocking.
{code}
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream 
(S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream 
(S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998 
bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted 
operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(761)) - drain or 
abort reason unbuffer() remaining=998 abort=false
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(771)) - draining 
998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(789)) - Drained 
stream of 998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(801)) - Closing 
stream
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(826)) - Stream 
s3a://stevel-london/test/testUnbufferDraining closed: unbuffer(); remaining=0
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed 
submitted operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] DEBUG 
s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging 
statistics into FS statistics in unbuffer(): ...

2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] INFO 
s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -

testUnbufferDraining: Starting read/unbuffer #2

{code}

But without the join, no joy. even though the log above shows the drain is 
being executed in the new thread.

{code}
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker 
(DurationInfo.java:close(101)) - read: duration 0:00.000s
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream 
(S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream 
(S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998 
bytes
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted 
operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG 
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed 
submitted operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010

2022-08-19 18:36:56,110 [JUnit-testUnbufferDraining] DEBUG 
s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging 
statistics into FS statistics in unbuffer():  ...
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] INFO 
s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -

testUnbufferDraining: Starting read/unbuffer #2

2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker 
(DurationInfo.java:<init>(80)) - Starting: lazySeek
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream 
(S3AInputStream.java:reopen(263)) - 
reopen(s3a://stevel-london/test/testUnbufferDraining) for read from new offset 
range[49001-50000], length=1, streamPosition=49002, nextReadPosition=49001, 
policy=random
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG 
impl.ChangeDetectionPolicy 
(ChangeDetectionPolicy.java:applyRevisionConstraint(430)) - Restricting get 
request to version RFW1GsJZAc_mL5eKKWRnB7tX2K9B3I98
2022-08-19 18:36:56,112 [JUnit-testUnbufferDraining] DEBUG impl.LoggingAuditor 
(LoggingAuditor.java:beforeExecution(327)) - [11] 
54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010 Executing op_open with 
{action_http_get_request 'test/testUnbufferDraining' size=998, mutating=false}; 
https://audit.example.org/hadoop/1/op_open/54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010/?op=op_open&p1=test/testUnbufferDraining&pr=stevel&ps=6cd73b0d-cfa1-41ec-ba5a-305b3abff0b0&id=54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010&t0=30&fs=54f63eab-e0d8-48b5-a0c1-9a877576a450&t1=11&ts=1660930615787
2022-08-19 18:36:58,223 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker 
(DurationInfo.java:close(101)) - lazySeek: duration 0:02.112s

// then the timeout surfaces on the next read()

org.apache.hadoop.net.ConnectTimeoutException: re-open 
s3a://stevel-london/test/testUnbufferDraining at 49001 on 
s3a://stevel-london/test/testUnbufferDraining: 
com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout 
waiting for connection from pool

        at 
org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:385)
        at 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:198)
        at 
org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:149)
        at 
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:276)
        at 
org.apache.hadoop.fs.s3a.S3AInputStream.lambda$lazySeek$1(S3AInputStream.java:429)
        at 
org.apache.hadoop.fs.s3a.Invoker.lambda$maybeRetry$3(Invoker.java:284)
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
        at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:410)
        at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:282)
        at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:326)
        at 
org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:421)
        at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:455)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at 
org.apache.hadoop.fs.s3a.performance.ITestUnbufferDraining.testUnbufferDraining(ITestUnbufferDraining.java:141)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
        at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:750)

Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: 
Timeout waiting for connection from pool
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1524)
at 
org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1600)
at 
org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:278)
at 
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:147)
... 26 more
Caused by: 
com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: 
Timeout waiting for connection from pool
at 
com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316)
at 
com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
at com.amazonaws.http.conn.$Proxy22.get(Unknown Source)
at 
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
at 
com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at 
com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at 
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at 
com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)
at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
... 40 more

{code}

> S3AInputStream.unbuffer() async drain not releasing http connections
> --------------------------------------------------------------------
>
>                 Key: HADOOP-18410
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18410
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.9
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>              Labels: pull-request-available
>
> Impala tcp-ds setup to s3 is hitting problems with timeout fetching http 
> connections from the s3a fs pool. Disabling s3a async drain makes this 
> problem *go away*. assumption, either those async ops are blocking, or they 
> are not releasing references properly.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to