[
https://issues.apache.org/jira/browse/CXF-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752904#comment-17752904
]
Leo Wörteler commented on CXF-8885:
-----------------------------------
We've run into this bug after upgrading to CXF 4.0.1 in KNIME Analytics
Platform 5.1. Long-running sessions that e.g. make REST requests in a loop are
running out of memory because of thousands of live SelectorManager threads.
After some intense debugging and heap-dump studying we seem to have identified
the culprit: The SelectorManager thread of an {{HttpClient}} is supposed to be
shut down after the {{HttpClient}}'s outer shell, the
{{HttpClientFacade}}, has been garbage collected. But through some subtle
details of Java's implementation of anonymous inner classes, a reference to the
{{HttpClientFacade}} was leaked into the SelectorManager thread through the
{{ProxySelector}} created in {{HttpClientHTTPConduit}}. This prevents the
facade from being garbage collected, so the thread can never shut down either.
I've opened a [PR|https://github.com/apache/cxf/pull/1377] against the {{main}}
branch on GitHub that uses a static nested class instead of the anonymous (and
therefore non-static) inner class, which prevents the leak.
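To illustrate the capture mechanism in isolation, here is a minimal sketch that does not use CXF at all. All names in it ({{CaptureDemo}}, {{Owner}}, {{LabelSupplier}}, {{holdsOwner}}) are made up for this example; {{Owner}} merely stands in for the object that is supposed to become collectable. The sketch assumes the anonymous class actually reads state of its enclosing instance, since newer javac versions may otherwise omit the hidden reference.

```java
import java.lang.reflect.Field;
import java.util.function.Supplier;

// Hypothetical demo class; none of these names exist in CXF or the JDK.
class CaptureDemo {

    // Stand-in for the object that should become garbage-collectable.
    static class Owner {
        private final String label;

        Owner(String label) {
            this.label = label;
        }

        // Anonymous inner class: reading 'label' goes through Owner.this,
        // so the compiler stores a hidden reference to the Owner instance.
        Supplier<String> anonymous() {
            return new Supplier<String>() {
                @Override
                public String get() {
                    return label;
                }
            };
        }

        // Static nested class: receives only the value it needs,
        // so it keeps no reference to the Owner instance.
        Supplier<String> detached() {
            return new LabelSupplier(label);
        }

        private static final class LabelSupplier implements Supplier<String> {
            private final String value;

            LabelSupplier(String value) {
                this.value = value;
            }

            @Override
            public String get() {
                return value;
            }
        }
    }

    // Returns true if the object's class declares a field (including
    // compiler-generated ones) whose type is Owner.
    static boolean holdsOwner(Object o) {
        for (Field f : o.getClass().getDeclaredFields()) {
            if (f.getType() == Owner.class) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Owner owner = new Owner("facade");
        System.out.println(holdsOwner(owner.anonymous())); // true
        System.out.println(holdsOwner(owner.detached()));  // false
    }
}
```

Anything that keeps the anonymous instance alive, such as a long-running thread, therefore also keeps its enclosing object alive, while the static variant does not.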
As an immediate workaround, we are now closing all {{WebClient}}s via their
(non-{{AutoCloseable}}) {{close()}} method and are using a
{{ClientLifeCycleManager}} to get notified when they are shutting down. At that
point we set the {{HttpClientHTTPConduit#client}} field to {{null}} via
reflection, with code equivalent to this:
{code:java}
@Override
public void clientDestroyed(final Client client) {
    final var conduit = client.getConduit();
    if (conduit instanceof HttpClientHTTPConduit) {
        try {
            // Look up the non-public 'client' field reflectively.
            final Field clientField =
                HttpClientHTTPConduit.class.getDeclaredField("client");
            clientField.setAccessible(true);
            // Clearing the field breaks the reference chain to the HttpClientFacade.
            clientField.set(conduit, null);
        } catch (NoSuchFieldException | SecurityException |
                IllegalArgumentException | IllegalAccessException ex) {
            LOGGER.log(Level.WARNING, "...", ex);
        }
    }
}
{code}
This breaks the reference chain from the SelectorManager thread to the facade,
so the thread can shut down.
> HttpClient SelectorManager threads run indefinitely causing OOM
> ---------------------------------------------------------------
>
> Key: CXF-8885
> URL: https://issues.apache.org/jira/browse/CXF-8885
> Project: CXF
> Issue Type: Bug
> Components: Transports
> Affects Versions: 4.0.0, 3.6.0
> Reporter: Cardo Eggert
> Priority: Major
> Attachments: image (5).png
>
>
> Probably caused by https://issues.apache.org/jira/browse/CXF-8840 .
> After updating from 3.5.x to 3.6.0, our servers started running out of
> memory. The resulting logs showed a large number of active threads with names
> of the form
> HttpClient-<NR>-SelectorManager
> After reverting to 3.5.6, the problem no longer occurred.
>
> While debugging with VisualVM, we saw that once such a thread was started, it
> never died, i.e. it ran indefinitely. The OOM happened once there were more
> than about 1000 of these threads.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)