[ 
https://issues.apache.org/jira/browse/CXF-8885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17752904#comment-17752904
 ] 

Leo Wörteler commented on CXF-8885:
-----------------------------------

We've run into this bug after upgrading to CXF 4.0.1 in KNIME Analytics 
Platform 5.1. Long-running sessions that e.g. make REST requests in a loop are 
running out of memory because of thousands of live SelectorManager threads.
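
For anyone who wants to check whether they are affected, a minimal sketch for 
counting these threads from inside the JVM could look like this. The class and 
method names are made up for illustration, and the name pattern 
{{HttpClient-<NR>-SelectorManager}} is taken from the issue description quoted 
below:
{code:java}
public class SelectorThreadCount {

    // Matches the thread name format reported in the issue: HttpClient-<NR>-SelectorManager
    static boolean isSelectorManager(final String threadName) {
        return threadName.startsWith("HttpClient-") && threadName.endsWith("-SelectorManager");
    }

    // Counts the live threads of the current JVM whose name matches the pattern.
    static long countSelectorManagerThreads() {
        return Thread.getAllStackTraces().keySet().stream()
            .map(Thread::getName)
            .filter(SelectorThreadCount::isSelectorManager)
            .count();
    }

    public static void main(final String[] args) {
        System.out.println("Live SelectorManager threads: " + countSelectorManagerThreads());
    }
}
{code}
Logging this number periodically should show whether it keeps growing over the 
lifetime of a session.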

After some intensive debugging and heap-dump analysis we seem to have identified 
the culprit: the SelectorManager thread of an {{HttpClient}} is supposed to be 
shut down after the {{HttpClient}}'s outer shell, the {{HttpClientFacade}}, has 
been garbage collected. But through a subtle detail of how Java implements 
anonymous inner classes (they keep an implicit reference to their enclosing 
instance), a reference to the {{HttpClientFacade}} was leaked into the 
SelectorManager thread through the {{ProxySelector}} created in 
{{HttpClientHTTPConduit}}. This prevents the facade from being garbage 
collected, so the thread can never shut down either. 
I've opened a [PR|https://github.com/apache/cxf/pull/1377] against the {{main}} 
branch on GitHub that uses a static inner class instead of the anonymous, 
non-static one, which prevents the leak.
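
To illustrate the general mechanism (this is not the CXF code; {{Owner}}, 
{{HostSupplier}} and the other names are hypothetical stand-ins), here is a 
small sketch. It shows that an anonymous class which uses a field of its 
enclosing instance carries a hidden reference to that instance, while a static 
nested class only holds what is passed to it explicitly:
{code:java}
import java.util.Arrays;
import java.util.function.Supplier;

public class CaptureDemo {

    // Hypothetical stand-in for an object that should become garbage-collectable.
    static class Owner {
        private final String proxyHost = "proxy.example";

        // Anonymous class: reads an instance field, so the compiler stores Owner.this in it.
        Supplier<String> anonymousSupplier() {
            return new Supplier<String>() {
                @Override
                public String get() {
                    return proxyHost;
                }
            };
        }

        // Static nested class: receives only the data it needs, no link back to Owner.
        Supplier<String> staticSupplier() {
            return new HostSupplier(proxyHost);
        }
    }

    static final class HostSupplier implements Supplier<String> {
        private final String host;

        HostSupplier(final String host) {
            this.host = host;
        }

        @Override
        public String get() {
            return host;
        }
    }

    // Checks via reflection whether the object's class declares a field of type Owner
    // (this includes the compiler-generated field for the enclosing instance).
    static boolean holdsOwner(final Object o) {
        return Arrays.stream(o.getClass().getDeclaredFields())
            .anyMatch(f -> f.getType() == Owner.class);
    }

    public static void main(final String[] args) {
        final Owner owner = new Owner();
        System.out.println("anonymous holds Owner: " + holdsOwner(owner.anonymousSupplier()));
        System.out.println("static holds Owner:    " + holdsOwner(owner.staticSupplier()));
    }
}
{code}
Anything that keeps the anonymous object alive, such as a long-running thread, 
therefore also keeps the enclosing object alive.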

As an immediate workaround, we are now closing all {{WebClient}}s via their 
(non-{{AutoCloseable}}) {{close()}} method and are using a 
{{ClientLifeCycleManager}} to get notified when they are shutting down. At this 
point we are setting the {{HttpClientHTTPConduit#client}} to {{null}} with code 
equivalent to this:
{code:java}
    @Override
    public void clientDestroyed(final Client client) {
        final var conduit = client.getConduit();
        if (conduit instanceof HttpClientHTTPConduit) {
            try {
                final Field clientField = HttpClientHTTPConduit.class.getDeclaredField("client");
                clientField.setAccessible(true);
                clientField.set(conduit, null);
            } catch (NoSuchFieldException | SecurityException | IllegalArgumentException | IllegalAccessException ex) {
                LOGGER.log(Level.WARNING, "...", ex);
            }
        }
    }
{code}
This breaks the reference chain from the SelectorManager thread to the facade, 
so the thread can shut down.

> HttpClient SelectorManager threads run indefinitely causing OOM
> ---------------------------------------------------------------
>
>                 Key: CXF-8885
>                 URL: https://issues.apache.org/jira/browse/CXF-8885
>             Project: CXF
>          Issue Type: Bug
>          Components: Transports
>    Affects Versions: 4.0.0, 3.6.0
>            Reporter: Cardo Eggert
>            Priority: Major
>         Attachments: image (5).png
>
>
> Probably caused by https://issues.apache.org/jira/browse/CXF-8840 .
> After updating from 3.5.x to 3.6.0, our servers started running out of memory 
> (OOM). The resulting logs showed a lot of active threads with names of the 
> form 
> HttpClient-<NR>-SelectorManager
> After reverting to 3.5.6, the problem no longer occurred.
>  
> Debugging with VisualVM showed that once such a thread was started, it never 
> died, i.e. it ran indefinitely. The OOM happened once there were somewhat over 
> 1000 of these threads.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
