[ https://issues.apache.org/jira/browse/HTTPCLIENT-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011851#comment-18011851 ]
Cenk Pekyaman edited comment on HTTPCLIENT-2386 at 8/4/25 12:16 PM: -------------------------------------------------------------------- [~abernal], even though the core issue seems to be inconsistency between classic and async, which is solved by pull-request#694, [https://github.com/apache/httpcomponents-core/pull/541] can still be useful in general, I think. most people assume they couldn't even connect when they see a `connectTimeoutException`, and assume they could connect but couldn't receive a response in time if they see a `socketTimeoutException`. it seems initial tls handshake is somewhat in the middle, and distinguishing it via a different message or exception can prove useful in diagnosing failures more easily. it can also help with some retry behaviour as a generic `socketTimeoutException` is not necessarily safe to retry if you don't have an idemponent request, but the other type of timeouts can be. but I don't know if this ticket is the correct place to fix / improve that part. was (Author: JIRAUSER310612): [~abernal], even though the core issue seems to be inconsistency between classic and async, which is solved by pull-request#694, [https://github.com/apache/httpcomponents-core/pull/541] can still be useful in general, I think. most people assume they couldn't even connect when they see a `connectTimeoutException`, and assume they could connect but couldn't receive a response in time if they see a `socketTimeoutException`. it seems initial tls handshake is somewhat in the middle, and distinguishing it via a different message or exception can prove useful in diagnosing failures more easily. it can also help with some retry behaviour as a generic `socketTimeoutException` is not necessarily safe to retry if you don't have an idemponent request, but the other type of timeouts can be. > httpClient5 socketTimeoutException in dataChannel with connectTimeout value > --------------------------------------------------------------------------- > > Key: HTTPCLIENT-2386 > URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2386 > Project: HttpComponents HttpClient > Issue Type: Improvement > Components: HttpClient (async) > Affects Versions: 5.4.4 > Environment: httpclient5 version: 5.4.4 > httpCore5 version: 5.3.4 > Java version: 17 (21 since recently) > Reporter: Cenk Pekyaman > Priority: Minor > Time Spent: 1h 40m > Remaining Estimate: 0h > > We are using http-client-5 under our api-client layer with async + TLS. Once > in a while, we get a timeout exception like this: > "{*}java.net.SocketTimeoutException: 2 SECONDS{*}" in some requests. and from > the stack trace, we see that the exception is coming from > "InternalDataChannel.onTimeout". but the value we see is actually the > {*}connectTimeout{*}, not the readTimeout({*}socketTimeout{*}) we set for the > request. > we specify our *connectTimeout* as *2 seconds* in > *PoolingAsyncClientConnectionManagerBuilder* during client creation. we > specify our *socketTimeout* as *10 seconds* in *RequestConfig* of > *HttpClientContext* during request execution. > > this is how we make the call (a bit simplified): > {{// the last parameter is our class implementing FutureCallback}} > {{Future<Message<HttpResponse, RES>> clientFuture = > asyncHttpClient.execute(asyncRequestProducer, responseConsumer, null, > httpClientContext, this);}} > {{// callTimeout is essentially connectTimeout + socketTimeout by default.}} > {{return clientFuture.get(callTimeout, TimeUnit.MILLISECONDS);}} > > We suspect that the client completes the initial connect step with > "InternalConnectChannel" and then the socket is handed over to > "InternalDataChannel" to do the TLS handshake in "{*}startTls{*}", which > either takes too long to complete, or hangs. Since we don't specify > handshakeTimeout in our tlsconfig, *DefaultAsyncClientConnectionOperator* > seems to be using *connectTimeout* for this stage (in the callback called > after connection is established ?). > > Code flow is essentially correct but getting a generic socketTimeoutException > with the value of connectTimeout seems a bit confusing, as we normally expect > some ConnectTimeoutException if we really have a connection timeout, or the > other way around. > > This is an example stack trace for such a case: > {{ at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient$2.failed(InternalAbstractHttpAsyncClient.java:364) > [httpclient5-5.4.4.jar:5.4.4] at > org.apache.hc.client5.http.impl.async.AsyncRedirectExec$1.failed(AsyncRedirectExec.java:246) > [httpclient5-5.4.4.jar:5.4.4] at > org.apache.hc.client5.http.impl.async.AsyncProtocolExec$1.failed(AsyncProtocolExec.java:295) > [httpclient5-5.4.4.jar:5.4.4] at > org.apache.hc.client5.http.impl.async.AsyncConnectExec$2.failed(AsyncConnectExec.java:233) > [httpclient5-5.4.4.jar:5.4.4] at > org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$4.failed(PoolingAsyncClientConnectionManager.java:482) > [httpclient5-5.4.4.jar:5.4.4] at > org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.FutureContribution.failed(FutureContribution.java:52) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.ssl.SSLIOSession$1.exception(SSLIOSession.java:233) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.ssl.SSLIOSession$1.timeout(SSLIOSession.java:223) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.InternalDataChannel.onTimeout(InternalDataChannel.java:170) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.InternalChannel.checkTimeout(InternalChannel.java:67) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.SingleCoreIOReactor.checkTimeout(SingleCoreIOReactor.java:239) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.SingleCoreIOReactor.validateActiveChannels(SingleCoreIOReactor.java:166) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.SingleCoreIOReactor.doExecute(SingleCoreIOReactor.java:128) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.AbstractSingleCoreIOReactor.execute(AbstractSingleCoreIOReactor.java:92) > [httpcore5-5.3.4.jar:5.3.4] at > org.apache.hc.core5.reactor.IOReactorWorker.run(IOReactorWorker.java:44) > [httpcore5-5.3.4.jar:5.3.4] at java.lang.Thread.run(Thread.java:840) [?:?] > Caused by: java.net.SocketTimeoutException: 2 SECONDS at > org.apache.hc.core5.io.SocketTimeoutExceptionFactory.create(SocketTimeoutExceptionFactory.java:50) > ~[httpcore5-5.3.4.jar:5.3.4]}} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org For additional commands, e-mail: dev-h...@hc.apache.org