[
https://issues.apache.org/jira/browse/HADOOP-18410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17581975#comment-17581975
]
Steve Loughran commented on HADOOP-18410:
-----------------------------------------
OK, this is bizarre.
The async stream draining only executes (in the other thread) if the original
invoker waits for the result.
this is not caused by some synchronized conflict -I made the method being
invoked static to ensure this.
And the logging of the start/finish of the call are present. just not the bit
in the middle.
h3. test run with the {{join()}} to make it blocking.
{code}
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream
(S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream
(S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998
bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted
operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(761)) - drain or
abort reason unbuffer() remaining=998 abort=false
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(771)) - draining
998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(789)) - Drained
stream of 998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(801)) - Closing
stream
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(826)) - Stream
s3a://stevel-london/test/testUnbufferDraining closed: unbuffer(); remaining=0
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed
submitted operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] DEBUG
s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging
statistics into FS statistics in unbuffer(): ...
2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] INFO
s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -
testUnbufferDraining: Starting read/unbuffer #2
{code}
But without the join, no joy. even though the log above shows the drain is
being executed in the new thread.
{code}
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker
(DurationInfo.java:close(101)) - read: duration 0:00.000s
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream
(S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream
(S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998
bytes
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted
operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG
s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed
submitted operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010
2022-08-19 18:36:56,110 [JUnit-testUnbufferDraining] DEBUG
s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging
statistics into FS statistics in unbuffer(): ...
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] INFO
s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -
testUnbufferDraining: Starting read/unbuffer #2
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker
(DurationInfo.java:<init>(80)) - Starting: lazySeek
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream
(S3AInputStream.java:reopen(263)) -
reopen(s3a://stevel-london/test/testUnbufferDraining) for read from new offset
range[49001-50000], length=1, streamPosition=49002, nextReadPosition=49001,
policy=random
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG
impl.ChangeDetectionPolicy
(ChangeDetectionPolicy.java:applyRevisionConstraint(430)) - Restricting get
request to version RFW1GsJZAc_mL5eKKWRnB7tX2K9B3I98
2022-08-19 18:36:56,112 [JUnit-testUnbufferDraining] DEBUG impl.LoggingAuditor
(LoggingAuditor.java:beforeExecution(327)) - [11]
54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010 Executing op_open with
{action_http_get_request 'test/testUnbufferDraining' size=998, mutating=false};
https://audit.example.org/hadoop/1/op_open/54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010/?op=op_open&p1=test/testUnbufferDraining&pr=stevel&ps=6cd73b0d-cfa1-41ec-ba5a-305b3abff0b0&id=54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010&t0=30&fs=54f63eab-e0d8-48b5-a0c1-9a877576a450&t1=11&ts=1660930615787
2022-08-19 18:36:58,223 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker
(DurationInfo.java:close(101)) - lazySeek: duration 0:02.112s
// then the timeout surfaces on the next read()
org.apache.hadoop.net.ConnectTimeoutException: re-open
s3a://stevel-london/test/testUnbufferDraining at 49001 on
s3a://stevel-london/test/testUnbufferDraining:
com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout
waiting for connection from pool
at
org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:385)
at
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:198)
at
org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:149)
at
org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:276)
at
org.apache.hadoop.fs.s3a.S3AInputStream.lambda$lazySeek$1(S3AInputStream.java:429)
at
org.apache.hadoop.fs.s3a.Invoker.lambda$maybeRetry$3(Invoker.java:284)
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:410)
at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:282)
at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:326)
at
org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:421)
at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:455)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at
org.apache.hadoop.fs.s3a.performance.ITestUnbufferDraining.testUnbufferDraining(ITestUnbufferDraining.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request:
Timeout waiting for connection from pool
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1524)
at
org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1600)
at
org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:278)
at
org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:147)
... 26 more
Caused by:
com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException:
Timeout waiting for connection from pool
at
com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316)
at
com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at
com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
at com.amazonaws.http.conn.$Proxy22.get(Unknown Source)
at
com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
at
com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at
com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at
com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at
com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)
at
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
... 40 more
{code}
> S3AInputStream.unbuffer() async drain not releasing http connections
> --------------------------------------------------------------------
>
> Key: HADOOP-18410
> URL: https://issues.apache.org/jira/browse/HADOOP-18410
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.3.9
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
>
> Impala tcp-ds setup to s3 is hitting problems with timeout fetching http
> connections from the s3a fs pool. Disabling s3a async drain makes this
> problem *go away*. assumption, either those async ops are blocking, or they
> are not releasing references properly.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]