Hi Adam,

sorry for my late response, and in fact thanks for your feedback, happy to hear it!
I will merge my branch and close the issue, no worries.

And you are exactly right, just log another issue for the Pool and we can 
hope that someone finds the time to have a look there : )

Julian

PS: As you have already dug into the code, you can of course try to fix it 
yourself; I would be more than happy to get a PR from you : )

From: Adam Rossi <ac.ro...@gmail.com>
Date: Tuesday, 25 August 2020 at 15:45
To: Julian Feinauer <j.feina...@pragmaticminds.de>
Cc: "dev@plc4x.apache.org" <dev@plc4x.apache.org>
Subject: Re: Leaking nioEventLoopGroup threads

Julian,

My apologies - your fix did indeed work. The issue is that 
PooledPlcDriverManager does not seem to call the close method on the 
connection. Switching back to PlcDriverManager from PooledPlcDriverManager 
results in your new log messages showing up in the log and, more importantly, 
no more leaked nioEventLoopGroup threads. I have tested the code in a loop 
for an hour or so and it is working perfectly.

The PooledPlcDriverManager seems to intercept the close method on the 
plcConnection (lines 125 - 130):

            if ("close".equals(method.getName())) {
                LOGGER.debug("close called on {}", plcConnection);
                proxyInvalidated.set(true);
                keyedObjectPool.returnObject(poolKey, plcConnection);
                return null;
            } else {

That makes sense, as the pool is trying to keep active connections pooled. 
However, when this connection is later retrieved from the pool, it seems the 
plcConnection connects again and creates an additional nioEventLoopGroup 
thread, which is never closed.
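
For reference, here is roughly how I am exercising the pool (a minimal 
sketch; the class name, Modbus connection string, and loop count are 
illustrative, and the import assumes the 0.7.x plc4j-connection-pool module):

    import org.apache.plc4x.java.api.PlcConnection;
    import org.apache.plc4x.java.utils.connectionpool.PooledPlcDriverManager;

    public class PoolLeakReproducer {
        public static void main(String[] args) throws Exception {
            PooledPlcDriverManager manager = new PooledPlcDriverManager();
            for (int i = 0; i < 10_000; i++) {
                // try-with-resources calls close(), but the pool proxy only
                // returns the connection to the pool; the underlying
                // nioEventLoopGroup thread is never shut down.
                try (PlcConnection connection =
                         manager.getConnection("modbus://192.168.1.10")) {
                    // ... read some Modbus values here ...
                }
            }
        }
    }

If the leak is present, jconsole should show one new nioEventLoopGroup 
thread appearing per iteration.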

I am new to working with this project and Jira. It seems to me that I should 
close the issue I just created as your fix does indeed correct the original 
issue, and perhaps open another issue on PooledPlcDriverManager for this thread 
leak?

Regards, Adam

On Mon, Aug 24, 2020 at 4:39 AM Julian Feinauer 
<j.feina...@pragmaticminds.de> wrote:
Hi,

Short feedback: I looked into the code, and indeed it seems we had a bug 
there which could lead to the socket leak you described.
I pushed a fix in the branch:

https://github.com/apache/plc4x/tree/bugfix/close-eventloop-after-channel

Would you mind taking a look and testing this with your code @Adam Rossi?

Thanks!
Julian

On 24.08.20, 08:26, "Julian Feinauer" <j.feina...@pragmaticminds.de> wrote:

    Perhaps some related questions:

    - Are you using Linux for your tests?
    - Do you close all connections properly?
    Normally the `PlcConnection.close()` method should close the EventLoop.
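
    For example (a minimal sketch using org.apache.plc4x.java.PlcDriverManager;
    the Modbus connection string is illustrative):

        // try-with-resources invokes close(), which should shut down the
        // connection's Netty EventLoop.
        try (PlcConnection connection =
                 new PlcDriverManager().getConnection("modbus://192.168.1.10")) {
            // ... reads/writes ...
        }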

    Julian

    On 24.08.20, 08:23, "Julian Feinauer" <j.feina...@pragmaticminds.de> wrote:

        Hi Adam,

        I will have a look today!

        Do we have a Jira Issue for it already?

        Julian

        On 24.08.20, 07:38, "Christofer Dutz" <christofer.d...@c-ware.de> wrote:

            Hi Adam,

            Of course that's unfortunate ... also, I will not be able to address
            this issue soon, as I have to work on the tasks of my research project.
            I have one more month to work on this, and I'm months behind schedule
            because I have been doing way too much free support lately.

            I really hope Julian will be able to help ... he's way more into
            the details of Netty than I am (cause he's got the book ;-) )

            So Julian? ... it would be super awesome if you could take on this
            issue.

            Chris



            On 24.08.20, 00:17, "Adam Rossi" <ac.ro...@gmail.com> wrote:

                Thanks, I did test with 0.8.0-SNAPSHOT and see the same
                behavior. For every plcConnection a nioEventLoopGroup thread
                is created and never seems to be destroyed.

                I wish I understood the io.netty.channel.EventLoopGroup class
                better to be more helpful here. Would an example program that
                reproduces this thread leak be useful?

                jconsole shows the thread data as:

                Name: nioEventLoopGroup-19-1
                State: RUNNABLE
                Total blocked: 0  Total waited: 0

                Stack trace:
                java.base@13.0.2/sun.nio.ch.EPoll.wait(Native Method)
                java.base@13.0.2/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
                java.base@13.0.2/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
                   - locked io.netty.channel.nio.SelectedSelectionKeySet@1838f97
                   - locked sun.nio.ch.EPollSelectorImpl@1f49287
                java.base@13.0.2/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:141)
                app//io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
                app//io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:803)
                app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
                app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
                app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
                app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
                java.base@13.0.2/java.lang.Thread.run(Thread.java:830)


                Regards, Adam

                On Sun, Aug 23, 2020 at 1:00 PM Christofer Dutz
                <christofer.d...@c-ware.de> wrote:

                > Hi Adam,
                >
                > the Apache SNAPSHOT repo is at:
                > https://repository.apache.org/content/repositories/snapshots
                >
                > Adding this to your pom should help:
                >
                >   <!-- Make Snapshots of Apache projects available -->
                >   <repositories>
                >     <repository>
                >       <id>apache-snapshots</id>
                >       <url>https://repository.apache.org/content/repositories/snapshots</url>
                >       <releases>
                >         <enabled>false</enabled>
                >       </releases>
                >       <snapshots>
                >         <enabled>true</enabled>
                >       </snapshots>
                >     </repository>
                >   </repositories>
                >
                >   <!-- Make Snapshots of Apache plugins available -->
                >   <pluginRepositories>
                >     <pluginRepository>
                >       <id>apache-snapshots</id>
                >       <url>https://repository.apache.org/content/repositories/snapshots</url>
                >       <releases>
                >         <enabled>false</enabled>
                >       </releases>
                >       <snapshots>
                >         <enabled>true</enabled>
                >       </snapshots>
                >     </pluginRepository>
                >   </pluginRepositories>
                >
                > Chris
                > On 23.08.20, 18:56, "Adam Rossi" <ac.ro...@gmail.com> wrote:
                >
                >     Sure thing. Is 0.8.0-SNAPSHOT hosted anywhere, or is that
                >     something that needs to be built? Regards, Adam
                >
                >     > On Aug 23, 2020, at 12:31 PM, Christofer Dutz
                >     > <christofer.d...@c-ware.de> wrote:
                >     >
                >     > Hmm ...
                >     >
                >     > Could you possibly give 0.8.0-SNAPSHOT a try? ... or 0.6.x? ...
                >     > 0.7.0 was the first of the new generation drivers. We're
                >     > maintaining the 0.6 branch and working hard on making the
                >     > new generation drivers 100% production ready.
                >     >
                >     > Chris
                >     >
                >     >
                >     > On 23.08.20, 18:06, "Adam Rossi" <ac.ro...@gmail.com> wrote:
                >     >
                >     >    This is the latest 0.7.0 release from Maven.
                >     >
                >     >
                >     >    Regards Adam
                >     >
                >     >> On Aug 23, 2020, at 11:56 AM, Christofer Dutz
                >     >> <christofer.d...@c-ware.de> wrote:
                >     >>
                >     >> Hi Adam,
                >     >>
                >     >> which version of PLC4X are you using? I know we had
                >     >> similar reports some time ago, but thought we had fixed them.
                >     >>
                >     >> Chris
                >     >>
                >     >>
                >     >>
                >     >> On 23.08.20, 16:40, "Adam Rossi" <ac.ro...@gmail.com> wrote:
                >     >>
                >     >>   Howdy. I am seeing a persistent thread being created
                >     >>   for every plcConnection connect, which looks like the
                >     >>   following:
                >     >>
                >     >>   Name: nioEventLoopGroup-11-1
                >     >>   State: RUNNABLE
                >     >>   Total blocked: 0  Total waited: 0
                >     >>
                >     >>   Stack trace:
                >     >>   java.base@13.0.2/sun.nio.ch.EPoll.wait(Native Method)
                >     >>   java.base@13.0.2/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
                >     >>   java.base@13.0.2/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
                >     >>      - locked io.netty.channel.nio.SelectedSelectionKeySet@f6df6b
                >     >>      - locked sun.nio.ch.EPollSelectorImpl@7ffc19
                >     >>   java.base@13.0.2/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:141)
                >     >>   app//io.netty.channel.nio.SelectedSelectionKeySetSelector.select(SelectedSelectionKeySetSelector.java:68)
                >     >>   app//io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:803)
                >     >>   app//io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:457)
                >     >>   app//io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989)
                >     >>   app//io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
                >     >>   app//io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
                >     >>   java.base@13.0.2/java.lang.Thread.run(Thread.java:830)
                >     >>
                >     >>
                >     >>   This happens both when creating a connection directly
                >     >>   and when using the plc4x connection pool.
                >     >>
                >     >>   In both cases I am obtaining the connection, reading
                >     >>   some modbus values, and then closing the connection.
                >     >>   From the plcConnection metadata, the connection type
                >     >>   I am using is:
                >     >>
                >     >>   org.apache.plc4x.java.spi.connection.DefaultNettyPlcConnection
                >     >>
                >     >>   Over time, thousands of these nioEventLoopGroup threads
                >     >>   are created and I experience an OutOfMemoryError in my
                >     >>   program.
                >     >>
                >     >>   Has anyone experienced this problem? Is there a way to
                >     >>   force these threads to close or time out, or to specify
                >     >>   a total thread limit?
                >     >>
                >     >>   Thanks and Regards, Adam
                >     >>
                >     >
                >
                >


