Hi,
can you test with this patch?
diff --git a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
index 50ebd4e..575b2f4 100644
--- a/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
+++ b/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java
@@ -695,8 +695,9 @@
     for (Iterator<S> i = allSessions(); i.hasNext();) {
         IoSession session = i.next();
+        scheduleRemove((S) session);
+
         if (session.isActive()) {
-            scheduleRemove((S) session);
             hasKeys = true;
         }
     }
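
For anyone following along, here is a rough standalone sketch of what the change does. It is not MINA code: FakeSession, scheduledBefore and scheduledAfter are names invented purely for illustration. As far as the diff shows, before the patch only sessions reporting isActive() were passed to scheduleRemove(); with the patch every session is passed, and hasKeys is still only set for the active ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemovalSketch {
    // Hypothetical stand-in for an IoSession; this is not a MINA class.
    static final class FakeSession {
        final String name;
        final boolean active;

        FakeSession(String name, boolean active) {
            this.name = name;
            this.active = active;
        }
    }

    // Loop as it was before the patch: only active sessions get scheduled.
    static List<String> scheduledBefore(List<FakeSession> sessions) {
        List<String> scheduled = new ArrayList<>();
        for (FakeSession s : sessions) {
            if (s.active) {
                scheduled.add(s.name);
            }
        }
        return scheduled;
    }

    // Loop as it is after the patch: every session gets scheduled.
    static List<String> scheduledAfter(List<FakeSession> sessions) {
        List<String> scheduled = new ArrayList<>();
        for (FakeSession s : sessions) {
            scheduled.add(s.name);
        }
        return scheduled;
    }

    public static void main(String[] args) {
        List<FakeSession> sessions = Arrays.asList(
                new FakeSession("open", true),
                new FakeSession("inactive", false));
        System.out.println("before: " + scheduledBefore(sessions));
        System.out.println("after:  " + scheduledAfter(sessions));
    }
}
```

In this toy model the session that is no longer active is skipped by the old loop and picked up by the new one. Whether a session with a failed SSL handshake really ends up in that state in your setup is exactly what the test run should tell us.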
On 11/10/2017 at 12:04, Christoph John wrote:
> Hi,
>
> thanks for the patch. I am using 2.0.16.
> Oddly enough, when I run all MINA tests, the ConnectorTest hangs on
> my machine in the testTCPWithSSL method. But I don't know if this is
> related. (stack trace attached)
> However, I will try out your patch and let you know.
>
> Thanks again,
> Chris.
>
>
> On 10/10/17 20:47, Jonathan Valliere wrote:
>> Which version of Mina are you using or are you building from Git?
>>
>> Please pull tag/2.0.16 from GIT and apply the attached patch. Let me
>> know if that fixes your problem. Sorry about the excess changes in
>> the patch; the java code formatter made a lot of changes. If this
>> works then we can create a JIRA bug.
>>
>> On Tue, Oct 10, 2017 at 4:49 AM, Christoph John
>> <[email protected]> wrote:
>>
>> Hi,
>>
>> thanks for your reply.
>> In fact it is hanging forever, i.e. until the process stops. I
>> have attached the original message I sent to the mailing list.
>> It only occurs sometimes, for SSL connections with a failing
>> handshake.
>> Unfortunately I have no reproducible example for MINA itself. I
>> could probably put something together for QuickFIX/J (the open
>> source project I am working on).
>>
>> My OS is Ubuntu 14.04.5 with JDK1.8_144. The problem appears only
>> occasionally on my machine but almost every time on the TravisCI
>> build server
>> (https://travis-ci.org/quickfix-j/quickfixj/builds/283210509).
>> As a result, some of the SSL-related tests are failing. TravisCI
>> has a nearly identical setup with JDK1.8_144 and Debian Linux.
>>
>> What would be a good starting point to create a test? I see that
>> there is an SslTest in the mina-core module. So I would probably
>> have to change that test to repeatedly connect, get a handshake
>> exception every time, and then take a number of stack traces.
>>
>> Thanks,
>> Chris.
>>
>>
>>
>>
>>
>> On 09/10/17 14:51, Jonathan Valliere wrote:
>>> What OS / Java Version / etc; Do you have a reproducible example?
>>>
>>> On Mon, Oct 9, 2017 at 8:34 AM, Jonathan Valliere
>>> <[email protected]> wrote:
>>>
>>> Let me know if it's hanging more than 1s
>>>
>>> On Mon, Oct 9, 2017 at 5:08 AM, Christoph John
>>> <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I have another question regarding this one. There is
>>> https://issues.apache.org/jira/browse/DIRMINA-1060
>>> which also sounds a little like the problem I'm having. When
>>> the connectors are hanging in the call to dispose(), there is
>>> always an accompanying NioProcessor which is hanging in
>>> select().
>>>
>>> Example:
>>> "NioProcessor-60" #100328 prio=5 os_prio=0 tid=0x00007f2a10003000 nid=0x2e71 runnable [0x00007f2a388b1000]
>>>    java.lang.Thread.State: RUNNABLE
>>>     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>>     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>>     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>>     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>>     - locked <0x00000000e239c118> (a sun.nio.ch.Util$3)
>>>     - locked <0x00000000e239c108> (a java.util.Collections$UnmodifiableSet)
>>>     - locked <0x00000000e239bed0> (a sun.nio.ch.EPollSelectorImpl)
>>>     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>>>     at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>>>     at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>>>     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>     at java.lang.Thread.run(Thread.java:748)
>>>
>>> "NioSocketConnector-38" #100326 prio=5 os_prio=0 tid=0x00007f2a3001d800 nid=0x2e6f in Object.wait() [0x00007f2a1f2d3000]
>>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>     at java.lang.Object.wait(Native Method)
>>>     at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>>>     - locked <0x00000000e246ae08> (a org.apache.mina.core.future.DefaultIoFuture)
>>>     at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>>>     at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>>>     at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>>>     - locked <0x00000000e246ae40> (a java.lang.Object)
>>>     at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>>>     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>>>     at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> At first I thought that this was related to DIRMINA-1059
>>> (https://issues.apache.org/jira/browse/DIRMINA-1059). In that
>>> ticket the synchronization was improved. However, I am also
>>> running into the problem with a build of 2.0.17-SNAPSHOT where
>>> DIRMINA-1059 was solved.
>>>
>>> So my only hope was DIRMINA-1060 ;) Could this improve
>>> the situation?
>>>
>>> Thanks,
>>> Chris.
>>>
>>>
>>> --
>>> Christoph John
>>> Development & Support
>>> Direct: +49 241 557080-28
>>> Mailto:[email protected]
>>>
>>> http://www.macd.com
>>>
>>> ----------------------------------------------------------------------------------------------------
>>> MACD GmbH
>>> Oppenhoffallee 103
>>> D-52066 Aachen
>>> Tel: +49 241 557080-0 | Fax: +49 241 557080-10
>>> Amtsgericht Aachen: HRB 8151
>>> VAT ID: DE 813021663
>>>
>>> Managing Director: George Macdonald
>>>
>>> ----------------------------------------------------------------------------------------------------
>>>
>>> take care of the environment - print only if necessary
>>>
>>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Christoph John <[email protected]>
>> To: [email protected]
>> Cc:
>> Bcc:
>> Date: Wed, 26 Jul 2017 13:59:58 +0200
>> Subject: leaking NioProcessors/NioSocketConnectors hanging in call to dispose
>> Hi,
>>
>> I am a developer and maintainer of the QuickFIX/J project
>> (https://github.com/quickfix-j/quickfixj) and I have a question
>> regarding NioSocketConnectors.
>>
>> We are facing a problem when there is a process that constantly
>> (every 30 seconds) tries to connect to a counterparty and the
>> connection is established but dropped shortly after. Then
>> sometimes the NioProcessors/NioSocketConnectors are not cleaned
>> up properly. In the stack trace we see them hanging in a call to
>> dispose:
>>
>> "NioProcessor-1140" #239 prio=5 os_prio=0 tid=0x0000000001fe1800 nid=0x2523 runnable [0x00007f9c67e8f000]
>>    java.lang.Thread.State: RUNNABLE
>>     at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>     at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
>>     at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
>>     at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
>>     - locked <0x00000000f6699e60> (a sun.nio.ch.Util$3)
>>     - locked <0x00000000f6699e50> (a java.util.Collections$UnmodifiableSet)
>>     - locked <0x00000000f6699c18> (a sun.nio.ch.EPollSelectorImpl)
>>     at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
>>     at org.apache.mina.transport.socket.nio.NioProcessor.select(NioProcessor.java:98)
>>     at org.apache.mina.core.polling.AbstractPollingIoProcessor$Processor.run(AbstractPollingIoProcessor.java:1075)
>>     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> "NioSocketConnector-68" #238 prio=5 os_prio=0 tid=0x00007f9c70caf000 nid=0x2522 in Object.wait() [0x00007f9c6af9f000]
>>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>     at java.lang.Object.wait(Native Method)
>>     at org.apache.mina.core.future.DefaultIoFuture.await0(DefaultIoFuture.java:209)
>>     - locked <0x00000000f66ac718> (a org.apache.mina.core.future.DefaultIoFuture)
>>     at org.apache.mina.core.future.DefaultIoFuture.awaitUninterruptibly(DefaultIoFuture.java:141)
>>     at org.apache.mina.core.polling.AbstractPollingIoProcessor.dispose(AbstractPollingIoProcessor.java:188)
>>     at org.apache.mina.core.service.SimpleIoProcessorPool.dispose(SimpleIoProcessorPool.java:329)
>>     - locked <0x00000000f66ac750> (a java.lang.Object)
>>     at org.apache.mina.core.polling.AbstractPollingIoConnector$Connector.run(AbstractPollingIoConnector.java:582)
>>     at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:64)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>     at java.lang.Thread.run(Thread.java:748)
>>
>> It does not happen very often: about 5% of the connection
>> attempts leave a NioSocketConnector hanging.
>> It only seems to happen, though, when the connection is
>> disconnected by
>> "javax.net.ssl.SSLHandshakeException: SSL handshake failed",
>> although there are cases when there is no leak even on an
>> SSLHandshakeException.
>> If the connection was reset "normally" by
>> "java.io.IOException: Connection reset by peer" then the leak
>> does not seem to occur. It also does not occur when the
>> connection is refused right away.
>>
>> Since this seems to be related to SSL connections: is there
>> something that we need to take care of when using the SSL filter?
>>
>> The code for the IoSessionInitiator can be found here:
>> https://github.com/quickfix-j/quickfixj/blob/master/quickfixj-core/src/main/java/quickfix/mina/initiator/IoSessionInitiator.java
>>
>> I have added some comments in this gist (starting with "chrjohn"):
>> https://gist.github.com/chrjohn/2671f06d80e8d917d9061b573477ec5b
>>
>> I cannot rule out that we might be doing something wrong here, so
>> any pointer is appreciated. :)
>>
>> Thanks in advance for your help and best regards,
>> Chris.
>>
>>
>
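
When testing the patch, it may help to check from inside the test JVM whether any thread is still sitting in dispose(), rather than reading thread dumps by hand. Below is a minimal sketch, assuming the frame of interest is the AbstractPollingIoProcessor.dispose frame seen in the stack traces quoted above. DisposeHangCheck and its two methods are made-up names, not part of MINA; the only library call used is the JDK's Thread.getAllStackTraces().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class DisposeHangCheck {
    // Class and method names copied from the stack traces quoted in this thread.
    static final String DISPOSE_CLASS =
            "org.apache.mina.core.polling.AbstractPollingIoProcessor";
    static final String DISPOSE_METHOD = "dispose";

    // True if any frame of the given stack is AbstractPollingIoProcessor.dispose().
    static boolean isInDispose(StackTraceElement[] stack) {
        for (StackTraceElement frame : stack) {
            if (DISPOSE_CLASS.equals(frame.getClassName())
                    && DISPOSE_METHOD.equals(frame.getMethodName())) {
                return true;
            }
        }
        return false;
    }

    // Names of all live threads that are inside dispose() at the moment of the call.
    static List<String> threadsInDispose() {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            if (isInDispose(entry.getValue())) {
                names.add(entry.getKey().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("threads in dispose(): " + threadsInDispose());
    }
}
```

A single snapshot only shows that a thread is inside dispose() at that instant. To treat it as a hang, the check would have to be repeated over a period longer than a normal dispose takes, with the same thread name showing up each time.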
--
Emmanuel Lecharny
Symas.com
directory.apache.org