Which version of MINA are you using, or are you building from Git?

Please pull tag/2.0.16 from Git and apply the attached patch.  Let me know
if that fixes your problem.  Sorry about the excess changes in the patch;
the Java code formatter made a lot of changes. If this works then we can
create a JIRA bug.

On Tue, Oct 10, 2017 at 4:49 AM, Christoph John <christoph.j...@macd.com>
wrote:

> Hi,
>
> thanks for your reply.
> In fact it hangs forever, i.e. until the process stops. I have
> attached the original message I sent to the mailing list. It occurs only
> sometimes, for SSL connections with a failing handshake.
> Unfortunately I have no reproducible example for MINA itself. I could
> probably put something together for QuickFIX/J (the open source project I
> am working on).
>
> My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem appears only rarely
> on my machine but almost every time on the TravisCI build server (
> https://travis-ci.org/quickfix-j/quickfixj/builds/283210509). As a
> result, some of the SSL-related tests are failing. TravisCI has a nearly
> identical setup with JDK1.8_144 and Debian Linux.
>
> What would be a good starting point to create a test? I see that there is
> an SslTest in the mina-core module. So I probably have to change that test
> to repeatedly connect and get a handshake exception every time and then take
> a number of stack traces.
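
For the "take a number of stack traces" part, a small stdlib-only helper can make the check repeatable inside a test. The following is a minimal sketch, not MINA or QuickFIX/J code: the class name ThreadLeakCheck and its methods are hypothetical, and it assumes leaked threads keep the "NioProcessor-" and "NioSocketConnector-" name prefixes seen in the thread dumps quoted in this thread. It relies only on Thread.getAllStackTraces() from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: finds live threads by name prefix and formats their stacks. */
public class ThreadLeakCheck {

    /** Returns all live threads whose name starts with the given prefix. */
    public static List<Thread> liveThreadsWithPrefix(String prefix) {
        List<Thread> matches = new ArrayList<>();
        // getAllStackTraces() returns a snapshot of the live threads in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().startsWith(prefix)) {
                matches.add(t);
            }
        }
        return matches;
    }

    /** Formats name, state and stack frames of each thread, roughly like a jstack entry. */
    public static String dump(List<Thread> threads) {
        StringBuilder sb = new StringBuilder();
        for (Thread t : threads) {
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : t.getStackTrace()) {
                sb.append("        at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for a leaked worker; the name only mimics the prefix from the dumps.
        Thread fake = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException e) {
                // Interrupted on purpose below; just exit.
            }
        }, "NioProcessor-demo");
        fake.setDaemon(true);
        fake.start();

        System.out.print(dump(liveThreadsWithPrefix("NioProcessor-")));

        fake.interrupt();
        fake.join();
        System.out.println("remaining: " + liveThreadsWithPrefix("NioProcessor-").size());
    }
}
```

In a test, one could call liveThreadsWithPrefix("NioSocketConnector-") after each failed handshake and dispose, and print dump(...) whenever the list is non-empty.
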
>
> Thanks,
> Chris.
>
>
>
>
>
> On 09/10/17 14:51, Jonathan Valliere wrote:
>
> What OS, Java version, etc.? Do you have a reproducible example?
>
> On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere <jon.valli...@emoten.com> wrote:
>
>> Let me know if it's hanging for more than 1s.
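
To answer the "more than 1s" question in an automated run rather than by eye, one option is to time how long matching threads stay alive after dispose() has been requested. This is a sketch under assumptions: ThreadExitWait and millisUntilGone are hypothetical names, the 50 ms polling interval is arbitrary, and the thread-name prefix is taken from the dumps quoted in this thread. It uses only the JDK.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: measures how long threads with a given name prefix stay alive. */
public class ThreadExitWait {

    /** True if any live thread in this JVM has a name starting with the prefix. */
    static boolean anyLive(String prefix) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Polls until no live thread matches the prefix.
     * Returns the elapsed milliseconds, or -1 if the timeout elapsed
     * (or the calling thread was interrupted) while a match was still alive.
     */
    public static long millisUntilGone(String prefix, long timeoutMillis) {
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (!anyLive(prefix)) {
                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            }
            if (System.nanoTime() - deadline >= 0) {
                return -1;
            }
            try {
                Thread.sleep(50); // arbitrary polling interval
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in thread that exits on its own after roughly 300 ms.
        Thread fake = new Thread(() -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                // Nothing to clean up in this demo.
            }
        }, "NioSocketConnector-demo");
        fake.setDaemon(true);
        fake.start();

        System.out.println("gone after ms: " + millisUntilGone("NioSocketConnector-", 5_000));
    }
}
```

A return value of -1 with a timeout well above one second would point to a thread that is still alive, which is the case worth a full thread dump.
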
>>
>> On Mon, Oct 9, 2017 at 5:08 AM, Christoph John <christoph.j...@macd.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have another question regarding this one. There is
>>> https://issues.apache.org/jira/browse/DIRMINA-1060 which also sounds a
>>> little like the problem I'm having. When the connectors are hanging in the
>>> call to dispose(), there is always an accompanying NioProcessor which
>>> is hanging in select().
>>>
>>> Example:
>>> "NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
>>>    java.lang.Thread.State: RUNNABLE
>>>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>>         - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
>>>         - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
>>>         - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
>>>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>>>         at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>>>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>>>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> "NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
>>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>         at java.lang.Object.wait(Native Method)
>>>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>>>         - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
>>>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>>>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>>>         at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>>>         - locked <0x00000000e246ae40> (a java.lang.Object)
>>>         at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>>>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> At first I thought that this was related to
>>> https://issues.apache.org/jira/browse/DIRMINA-1059. In that ticket the
>>> synchronization was improved. However, I am also running into the problem
>>> with a build of 2.0.17-SNAPSHOT where DIRMINA-1059 was solved.
>>>
>>> So my only hope was DIRMINA-1060 ;) Could this improve the situation?
>>>
>>> Thanks,
>>> Chris.
>>>
>>>
>>> --
>>> Christoph John
>>> Development & Support
>>> Direct: +49 241 557080-28
>>> Mailto:christoph.j...@macd.com
>>>
>>>
>>> http://www.macd.com
>>> MACD GmbH
>>> Oppenhoffallee 103
>>> D-52066 Aachen
>>> Tel: +49 241 557080-0 | Fax: +49 241 557080-10
>>> Amtsgericht Aachen: HRB 8151
>>> Ust.-Id: DE 813021663
>>>
>>> Geschäftsführer: George Macdonald
>>>
>>> take care of the environment - print only if necessary
>>>
>>
>>
>
>
>
> ---------- Forwarded message ----------
> From: Christoph John <christoph.j...@macd.com>
> To: dev@mina.apache.org
> Cc:
> Bcc:
> Date: Wed, 26 Jul 2017 13:59:58 +0200
> Subject: leaking NioProcessors/NioSocketConnectors hanging in call to
> dispose
> Hi,
>
> I am a developer and maintainer of the QuickFIX/J project (
> https://github.com/quickfix-j/quickfixj) and I have a question regarding
> NioSocketConnectors.
>
> We are facing a problem when there is a process that constantly (every 30
> seconds) tries to connect to a counterparty and the connection is
> established but dropped shortly after. Then sometimes the
> NioProcessors/NioSocketConnectors are not cleaned up properly. In the
> stack trace we see them hanging in a call to dispose:
>
> "NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>         - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
>         - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
>         - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
>         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>         at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> "NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>         - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
>         at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>         at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>         at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>         - locked <0x00000000f66ac750> (a java.lang.Object)
>         at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>         at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:748)
>
> It does not happen very often: about 5% of the connection attempts leave a
> NioSocketConnector hanging.
> It seems to happen only when the connection is terminated by
> "javax.net.ssl.SSLHandshakeException: SSL handshake failed", although
> there are cases where there is no leak even on an SSLHandshakeException.
> If the connection was reset "normally" by "java.io.IOException: Connection
> reset by peer" then the leak does not seem to occur. It also does not occur
> when the connection is refused right away.
>
> Since this seems to be related to SSL connections: is there something that
> we need to take care of when using the SSL filter?
>
> The code for the IoSessionInitiator can be found here:
> https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
> I have added some comments in this gist (starting with "chrjohn"):
> https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
>
> I cannot rule out that we might be doing something wrong here, so any
> pointer is appreciated. :)
>
> Thanks in advance for your help and best regards,
> Chris.
>
>
>
