Hi Stephan

Please raise a ticket in the project JIRA for this issue, and attach to it the log exhibiting the problem, including the exception stack trace.

And yes, a reproducer of some sort would be very helpful.

Unrelated to the problem: if you do not need advanced client-side HTTP features and want the maximum throughput in terms of message exchanges over the same time period, you should probably be using the minimal HttpClient implementation or even plain HttpCore.

Oleg


On 9/19/2025 9:02 AM, Stephan Epping wrote:
Hello,

I have investigated a tricky error we experienced multiple times in our benchmarks (that use the httpclient to send thousands of requests to our clusters).

We are using a quite up-to-date version (5.3.4 / 5.5.).

You can find more details about the analysis here <https://github.com/camunda/camunda/issues/34597#issuecomment-3301797932>.

*TLDR;*

The stack trace shows a tight synchronous callback cycle inside HttpComponents' async path that repeatedly alternates between /completed/ → /release/discard/fail/ → /connect/proceedToNextHop/ → /completed/, causing unbounded recursion until the JVM stack overflows.

Concretely the cycle is:

· AsyncConnectExec$1.completed → InternalHttpAsyncExecRuntime$1.completed → BasicFuture.completed

· PoolingAsyncClientConnectionManager lease/completed → *StrictConnPool.fireCallbacks* → StrictConnPool.release → PoolingAsyncClientConnectionManager.release

· InternalHttpAsyncExecRuntime.discardEndpoint → InternalAbstractHttpAsyncClient$2.failed → AsyncRedirectExec/AsyncHttpRequestRetryExec/AsyncProtocolExec/AsyncConnectExec.*failed → BasicFuture.failed / ComplexFuture.failed → PoolingAsyncClientConnectionManager$4.failed → DefaultAsyncClientConnectionOperator$1.failed → MultihomeIOSessionRequester.connect → DefaultAsyncClientConnectionOperator.connect → PoolingAsyncClientConnectionManager.connect → InternalHttpAsyncExecRuntime.connectEndpoint → AsyncConnectExec.proceedToNextHop → *back to* AsyncConnectExec$1.completed.

Because callbacks (completed / failed) are executed synchronously on the same call stack, and some code paths both /complete/ and then trigger /failed/, retry, or next-hop connection logic (via pool callbacks and the connection operator), the call stack never unwinds: recursion depth grows until a StackOverflowError is thrown.

*Possible concrete root causes (detailed)*

1. *Synchronous BasicFuture callbacks*
BasicFuture.completed() and .failed() call callbacks immediately on the thread that completes the future. If a callback in turn calls pool release(), which calls fireCallbacks() (synchronously), the chain can re-enter callback code without unwinding. Re-entrancy depth grows with each attempted connect/release cycle.

2. *Multihome connect tries multiple addresses in the same stack*
MultihomeIOSessionRequester.connect will attempt alternate addresses (A/AAAA records). If an address fails quickly and the code immediately tries the next address by invoking connection manager code and its callbacks synchronously, the recursion gets deeper with each try.

3. *Retries/redirects executed synchronously*
The exec chain (redirect → retry → protocol → connect) will call failed() listeners which in turn call connect again. If those calls are synchronous, you get direct recursive invocation.

4. *Potential omission of an async boundary*
A simple but dangerous pattern is: /complete future/ → /call listener/ → /listener calls code that completes other futures/ → repeat. If there is no executor handoff, the recursion remains on the same thread.
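
To illustrate the pattern in items 1 to 4, here is a minimal sketch that uses only the JDK and is not HttpComponents code. All names in it (`ReentrancyDemo`, `Callback`, `connect`, `syncMaxDepth`, `queuedMaxDepth`) are hypothetical, and the always-failing `connect` is an assumption made purely for demonstration. It contrasts a failure callback that retries inline with one that queues the retry so the stack can unwind first.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ReentrancyDemo {

    /** Hypothetical callback, loosely modelled on a future's failure callback. */
    interface Callback {
        void failed();
    }

    private int depth;     // current nesting depth of connect() calls
    private int maxDepth;  // deepest nesting observed during a run

    // Pending work for the queued variant; stands in for an executor handoff.
    private final Deque<Runnable> queue = new ArrayDeque<>();

    /** Simulated connect attempt that always fails and notifies the callback inline. */
    private void connect(Callback callback) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        callback.failed();  // runs on the caller's stack, like a synchronous callback
        depth--;
    }

    /** The failure callback retries directly, so every attempt nests inside the previous one. */
    public int syncMaxDepth(int attempts) {
        depth = 0;
        maxDepth = 0;
        retrySync(attempts);
        return maxDepth;
    }

    private void retrySync(int attemptsLeft) {
        if (attemptsLeft == 0) {
            return;
        }
        connect(() -> retrySync(attemptsLeft - 1));
    }

    /** The failure callback only enqueues the retry; the drain loop runs it after the stack unwinds. */
    public int queuedMaxDepth(int attempts) {
        depth = 0;
        maxDepth = 0;
        queue.add(() -> retryQueued(attempts));
        Runnable task;
        while ((task = queue.poll()) != null) {
            task.run();
        }
        return maxDepth;
    }

    private void retryQueued(int attemptsLeft) {
        if (attemptsLeft == 0) {
            return;
        }
        connect(() -> queue.add(() -> retryQueued(attemptsLeft - 1)));
    }

    public static void main(String[] args) {
        ReentrancyDemo demo = new ReentrancyDemo();
        System.out.println("sync depth:   " + demo.syncMaxDepth(100));
        System.out.println("queued depth: " + demo.queuedMaxDepth(100));
    }
}
```

In the inline variant the nesting depth equals the number of attempts, so a long enough run of fast failures would exhaust the stack. In the queued variant the depth stays at 1 no matter how many attempts are made. Whether the real client can hit a comparably long run of synchronous failures is exactly the open question here.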

I haven’t been able to create a unit test that reproduces the issue locally, even though I tried multiple approaches (synthetic http server that is flaky, randomly failing custom dns resolver, thousands of requests scheduled, etc.).

Does someone have an idea what we are doing wrong? Is this a bug or a misconfiguration on our side? We have now switched to the `LAX` concurrency policy, which seems to mitigate the issue, but I believe it only makes the problem less likely rather than fixing the root cause. (I can see the lax pool also has the sync fireCallbacks approach etc.)

I have attached a stack trace, but here is a brief excerpt (as I don’t know if attachments work in this mailing list):

/ERROR 2025-06-30T17:06:10.233570881Z Exception in thread "httpclient-dispatch-1" java.lang.StackOverflowError/

/--------------------------------------------------------------------------------/

/ERROR 2025-06-30T17:06:10.299219300Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:164)/

/ERROR 2025-06-30T17:06:10.299221745Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:153)/

/ERROR 2025-06-30T17:06:10.299223952Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:128)/

/ERROR 2025-06-30T17:06:10.299226128Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:120)/

/ERROR 2025-06-30T17:06:10.299228287Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)/

/ERROR 2025-06-30T17:06:10.299230488Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.leaseCompleted(PoolingAsyncClientConnectionManager.java:339)/

/ERROR 2025-06-30T17:06:10.299232722Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:324)/

/ERROR 2025-06-30T17:06:10.299234969Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:285)/

/ERROR 2025-06-30T17:06:10.299237136Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)/

/ERROR 2025-06-30T17:06:10.299239404Z at org.apache.hc.core5.pool.StrictConnPool.fireCallbacks(StrictConnPool.java:401)/

/ERROR 2025-06-30T17:06:10.299241531Z at org.apache.hc.core5.pool.StrictConnPool.release(StrictConnPool.java:272)/

/ERROR 2025-06-30T17:06:10.299243647Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.release(PoolingAsyncClientConnectionManager.java:424)/

/ERROR 2025-06-30T17:06:10.299245815Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:156)/

/ERROR 2025-06-30T17:06:10.299247909Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:180)/

/ERROR 2025-06-30T17:06:10.299250099Z at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient$2.failed(InternalAbstractHttpAsyncClient.java:363)/

/ERROR 2025-06-30T17:06:10.299252352Z at org.apache.hc.client5.http.impl.async.AsyncRedirectExec$1.failed(AsyncRedirectExec.java:261)/

/ERROR 2025-06-30T17:06:10.299254470Z at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec$1.failed(AsyncHttpRequestRetryExec.java:195)/

/ERROR 2025-06-30T17:06:10.299256671Z at org.apache.hc.client5.http.impl.async.AsyncProtocolExec$1.failed(AsyncProtocolExec.java:297)/

/ERROR 2025-06-30T17:06:10.299258827Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$2.failed(AsyncConnectExec.java:235)/

/ERROR 2025-06-30T17:06:10.299261062Z at org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52)/

/ERROR 2025-06-30T17:06:10.299263164Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299265335Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299267498Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$4.failed(PoolingAsyncClientConnectionManager.java:485)/

/ERROR 2025-06-30T17:06:10.299273172Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299275385Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299277443Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator$1.failed(DefaultAsyncClientConnectionOperator.java:170)/

/ERROR 2025-06-30T17:06:10.299279661Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299281730Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299283917Z at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester.connect(MultihomeIOSessionRequester.java:118)/

/ERROR 2025-06-30T17:06:10.299287550Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator.connect(DefaultAsyncClientConnectionOperator.java:115)/

/ERROR 2025-06-30T17:06:10.299290061Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.connect(PoolingAsyncClientConnectionManager.java:456)/

/ERROR 2025-06-30T17:06:10.299292128Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.connectEndpoint(InternalHttpAsyncExecRuntime.java:226)/

/ERROR 2025-06-30T17:06:10.299294317Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.doProceedToNextHop(AsyncConnectExec.java:222)/

/ERROR 2025-06-30T17:06:10.299296347Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.proceedToNextHop(AsyncConnectExec.java:197)/

/ERROR 2025-06-30T17:06:10.299298506Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.access$000(AsyncConnectExec.java:92)/

/ERROR 2025-06-30T17:06:10.299388975Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:164)/

/ERROR 2025-06-30T17:06:10.299401176Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:153)/

/ERROR 2025-06-30T17:06:10.299403967Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:128)/

/ERROR 2025-06-30T17:06:10.299406043Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:120)/

/ERROR 2025-06-30T17:06:10.299408227Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)/

/ERROR 2025-06-30T17:06:10.299410406Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.leaseCompleted(PoolingAsyncClientConnectionManager.java:339)/

/ERROR 2025-06-30T17:06:10.299412507Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:324)/

/ERROR 2025-06-30T17:06:10.299414382Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:285)/

/ERROR 2025-06-30T17:06:10.299426976Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)/

/ERROR 2025-06-30T17:06:10.299429367Z at org.apache.hc.core5.pool.StrictConnPool.fireCallbacks(StrictConnPool.java:401)/

/ERROR 2025-06-30T17:06:10.299431117Z at org.apache.hc.core5.pool.StrictConnPool.release(StrictConnPool.java:272)/

/ERROR 2025-06-30T17:06:10.299433157Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.release(PoolingAsyncClientConnectionManager.java:424)/

/ERROR 2025-06-30T17:06:10.299435324Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:156)/

/ERROR 2025-06-30T17:06:10.299437225Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:180)/

/ERROR 2025-06-30T17:06:10.299439134Z at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient$2.failed(InternalAbstractHttpAsyncClient.java:363)/

/ERROR 2025-06-30T17:06:10.299451280Z at org.apache.hc.client5.http.impl.async.AsyncRedirectExec$1.failed(AsyncRedirectExec.java:261)/

/ERROR 2025-06-30T17:06:10.299453403Z at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec$1.failed(AsyncHttpRequestRetryExec.java:195)/

/ERROR 2025-06-30T17:06:10.299455096Z at org.apache.hc.client5.http.impl.async.AsyncProtocolExec$1.failed(AsyncProtocolExec.java:297)/

/ERROR 2025-06-30T17:06:10.299456747Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$2.failed(AsyncConnectExec.java:235)/

/ERROR 2025-06-30T17:06:10.299458431Z at org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52)/

/ERROR 2025-06-30T17:06:10.299460343Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299462062Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299463899Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$4.failed(PoolingAsyncClientConnectionManager.java:485)/

/ERROR 2025-06-30T17:06:10.299465810Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299467649Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299469389Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator$1.failed(DefaultAsyncClientConnectionOperator.java:170)/

/ERROR 2025-06-30T17:06:10.299471284Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)/

/ERROR 2025-06-30T17:06:10.299473097Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)/

/ERROR 2025-06-30T17:06:10.299474928Z at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester.connect(MultihomeIOSessionRequester.java:118)/



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

