Hi,
thanks for your reply.
In fact, it hangs forever, i.e. until the process is stopped. I have attached the original message
that I sent to the mailing list. The hang only occurs sometimes, and only for SSL connections with a
failing handshake.
Unfortunately I have no reproducible example for MINA itself. I could probably put something
together for QuickFIX/J (the open source project I am working on).
My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem appears only rarely on my machine but almost
every time on the TravisCI build server
(https://travis-ci.org/quickfix-j/quickfixj/builds/283210509). As a result, some of the SSL-related
tests are failing. TravisCI has a very similar setup with JDK1.8_144 and Debian Linux.
What would be a good starting point to create a test? I see that there is an SslTest in the
mina-core module. So I would probably have to change that test to connect repeatedly, get a
handshake exception every time, and then take a number of stack traces.
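For taking the stack traces from inside such a test, something like the following might do as a
starting point. It is only a sketch using plain JDK calls; the class name StuckThreadFinder and its
method are made up for illustration and are not part of MINA or QuickFIX/J.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a MINA or QuickFIX/J class): scans all live threads
// and reports those with a given name prefix that have a given method on their stack.
final class StuckThreadFinder {
    private StuckThreadFinder() {}

    static List<String> findThreadsIn(String namePrefix, String methodName) {
        List<String> hits = new ArrayList<>();
        // Thread.getAllStackTraces() returns a snapshot of the stacks of all live threads.
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix)) {
                continue;
            }
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getMethodName().equals(methodName)) {
                    hits.add(thread.getName() + " [" + thread.getState() + "]");
                    break; // one hit per thread is enough
                }
            }
        }
        return hits;
    }
}
```

After each failed handshake the test could call
StuckThreadFinder.findThreadsIn("NioSocketConnector-", "dispose") a few seconds apart. A thread name
that shows up in every snapshot would be a candidate for the hang.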
Thanks,
Chris.
On 09/10/17 14:51, Jonathan Valliere wrote:
What OS / Java Version / etc; Do you have a reproducible example?
On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere <jon.valli...@emoten.com> wrote:
Let me know if it's hanging more than 1s
On Mon, Oct 9, 2017 at 5:08 AM, Christoph John <christoph.j...@macd.com> wrote:
Hi,
I have another question regarding this one. There is
https://issues.apache.org/jira/browse/DIRMINA-1060 which also sounds a little like the problem I'm
having. When the connectors are hanging in the call to dispose(), there is always an accompanying
NioProcessor which is hanging in select().
Example:
"NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
    - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
"NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
    - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
    at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
    at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
    - locked <0x00000000e246ae40> (a java.lang.Object)
    at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
At first I thought that this was related to https://issues.apache.org/jira/browse/DIRMINA-1059. In
that ticket the synchronization was improved. However, I am also running into the problem with a
build of 2.0.17-SNAPSHOT where DIRMINA-1059 was solved.
So my only hope was DIRMINA-1060 ;) Could this improve the situation?
Thanks,
Chris.
--
Christoph John
Development & Support
Direct: +49 241 557080-28
Mailto:christoph.j...@macd.com
http://www.macd.com
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
MACD GmbH
Oppenhoffallee 103
D-52066 Aachen
Tel: +49 241 557080-0 | Fax: +49 241 557080-10
Amtsgericht Aachen: HRB 8151
Ust.-Id: DE 813021663
Geschäftsführer: George Macdonald
----------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------
take care of the environment - print only if necessary
--- Begin Message ---
Hi,
I am a developer and maintainer of the QuickFIX/J project (https://github.com/quickfix-j/quickfixj)
and I have a question regarding NioSocketConnectors.
We are facing a problem when there is a process that constantly (every 30 seconds) tries to connect
to a counterparty and the connection is established but dropped shortly after. Then sometimes the
NioProcessors/NioSocketConnectors are not cleaned up properly. In the stack trace we see them
hanging in a call to dispose:
"NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
    - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
    - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
    - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
    at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
"NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
    - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
    at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
    at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
    at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
    - locked <0x00000000f66ac750> (a java.lang.Object)
    at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
    at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
It does not happen very often: about 5% of the connection attempts leave a
NioSocketConnector hanging.
So far it only seems to happen when the connection is terminated by
"javax.net.ssl.SSLHandshakeException: SSL handshake failed", although there are also cases where
no leak occurs even on an SSLHandshakeException.
If the connection was reset "normally" by "java.io.IOException: Connection reset by peer" then the
leak does not seem to occur. It also does not occur when the connection is refused right away.
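To put a number on how often threads are left behind, a test could wait a bounded time for the
matching threads to disappear after each connection attempt. The following is a rough sketch with
plain JDK calls only; ThreadLeakCheck and its method names are hypothetical and do not exist in MINA
or QuickFIX/J.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not a MINA or QuickFIX/J class) for detecting threads left behind.
final class ThreadLeakCheck {
    private ThreadLeakCheck() {}

    // Counts live threads whose name starts with the given prefix.
    static long countThreads(String namePrefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.isAlive() && t.getName().startsWith(namePrefix))
                .count();
    }

    // Polls until at most 'allowed' matching threads remain or the timeout elapses.
    // Returns true if the threads went away in time, false if they are still there.
    static boolean awaitThreadsGone(String namePrefix, long allowed, long timeout, TimeUnit unit)
            throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (countThreads(namePrefix) > allowed) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(50);
        }
        return true;
    }
}
```

A test loop could record the baseline count before connecting and then call
ThreadLeakCheck.awaitThreadsGone("NioSocketConnector-", baseline, 5, TimeUnit.SECONDS) after each
failed handshake, counting how many attempts return false. The 5 second limit is an arbitrary choice.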
Since this seems to be related to SSL connections: is there something that we need to take care of
when using the SSL filter?
The code for the IoSessionInitiator can be found here:
https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
I have added some comments in this gist (starting with "chrjohn"):
https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
I cannot rule out that we might be doing something wrong here, so any pointer
is appreciated. :)
Thanks in advance for your help and best regards,
Chris.
--- End Message ---