[
https://issues.apache.org/jira/browse/DIRMINA-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15403441#comment-15403441
]
Emmanuel Lecharny commented on DIRMINA-1021:
--------------------------------------------
Ok, I think I see a scenario where the {{CloseFuture}} never actually gets
terminated.
The {{writeBuffer}} call stack looks like this:
{noformat}
Processor.run
flush (OPENED)
flushNow
writeBuffer
{noformat}
If an exception occurs in {{writeBuffer}}, we try to close the session:
{noformat}
} catch (IOException ioe) {
    // We have had an issue while trying to send data to the
    // peer : let's close the session.
    buf.free();
    session.closeNow();
    destroy(session);
    return 0;
}
{noformat}
At this point, the {{CloseFuture}} returned by the {{session.closeNow()}} call
is not kept anywhere (which is not necessarily an issue, because we intend to
close the session immediately). It will be terminated by a call to the
{{fireSessionClosed()}} method, itself called by the {{fireSessionDestroyed()}}
method, called by the {{removeNow()}} method, called by the
{{removeSessions()}} method, ultimately called by the main {{Processor.run()}}
method. This requires the session to have been added to the
{{removingSessions}} queue, which should be the case as we called the
{{closeNow()}} method. However, as you pointed out before, the session might be
in a {{CLOSING}} state, and in this case we never call the {{removeNow()}}
method, which is what calls the {{fireSessionDestroyed()}} method, so the
{{CloseFuture}} is never terminated.
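To make the scenario concrete, here is a minimal toy model of it. This is only a sketch: {{ToyProcessor}}, {{ToySession}} and the explicit {{state}} field are hypothetical stand-ins, not MINA's real classes, and a {{CompletableFuture}} plays the role of the {{CloseFuture}}. It only mirrors the control flow described above.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Toy model only: these are NOT MINA's classes, just stand-ins for the flow above.
class ToyProcessor {
    enum State { OPENED, CLOSING, CLOSED }

    static final class ToySession {
        // Assumption: the state is a plain field here, set by hand for the demo.
        State state = State.OPENED;
        // Stands in for the CloseFuture handed out by closeNow().
        final CompletableFuture<Void> closeFuture = new CompletableFuture<>();
    }

    private final Queue<ToySession> removingSessions = new ArrayDeque<>();

    // Models closeNow(): queue the session for removal and return its future.
    CompletableFuture<Void> closeNow(ToySession session) {
        removingSessions.add(session);
        return session.closeFuture;
    }

    // Models removeNow(): the only place in this toy where the future completes.
    void removeNow(ToySession session) {
        session.state = State.CLOSED;
        session.closeFuture.complete(null);
    }

    // Models removeSessions(): drains the queue but ignores CLOSING sessions.
    int removeSessions() {
        int removed = 0;
        for (ToySession s = removingSessions.poll(); s != null; s = removingSessions.poll()) {
            switch (s.state) {
                case OPENED:
                    removeNow(s);
                    removed++;
                    break;
                case CLOSING:
                    // Skipped, and already polled off the queue:
                    // nothing will ever complete this session's future.
                    break;
                case CLOSED:
                    break;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        ToyProcessor processor = new ToyProcessor();
        ToySession stuck = new ToySession();
        stuck.state = State.CLOSING; // e.g. the peer reset the connection
        CompletableFuture<Void> future = processor.closeNow(stuck);
        processor.removeSessions();
        System.out.println("close future done? " + future.isDone());
    }
}
```

In this model a session that is {{OPENED}} when {{removeSessions()}} runs gets its future completed, while one that is already {{CLOSING}} is polled off the queue and forgotten.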
At this point, I think that modifying the {{writeBuffer}} method to do:
{noformat}
...
try {
    localWrittenBytes = write(session, buf, length);
} catch (IOException ioe) {
    // We have had an issue while trying to send data to the
    // peer : let's close the session.
    buf.free();
    session.closeNow();
    removeNow(session);
    return 0;
}
...
{noformat}
should pretty much cover this corner case. I have attached the latest build
containing this fix. Would you like to test it?
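As a rough illustration of why calling {{removeNow()}} directly in the catch block helps, here is a small self-contained sketch. Everything in it is hypothetical ({{WriteFailureSketch}}, {{Session}}, the {{Writer}} interface); it is not the attached build, and a {{CompletableFuture}} again stands in for the {{CloseFuture}}. The point is only that the write-failure path no longer depends on the state check done when the removal queue is drained.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;

// Toy model only: hypothetical names, not MINA's real classes or the attached build.
class WriteFailureSketch {
    enum State { OPENED, CLOSING, CLOSED }

    static final class Session {
        State state = State.OPENED;
        // Stands in for the CloseFuture.
        final CompletableFuture<Void> closeFuture = new CompletableFuture<>();
    }

    // Hypothetical stand-in for the low-level write to the peer.
    interface Writer {
        int write(Session session, byte[] data) throws IOException;
    }

    // Models closeNow(): marks the session as closing and returns its future.
    CompletableFuture<Void> closeNow(Session session) {
        if (session.state == State.OPENED) {
            session.state = State.CLOSING;
        }
        return session.closeFuture;
    }

    // Models removeNow(): completes the future whatever the current state is.
    void removeNow(Session session) {
        session.state = State.CLOSED;
        session.closeFuture.complete(null);
    }

    // Models the patched writeBuffer(): on a write failure, remove right away.
    int writeBuffer(Session session, byte[] data, Writer writer) {
        try {
            return writer.write(session, data);
        } catch (IOException ioe) {
            closeNow(session);  // returned future is not kept, as in the original
            removeNow(session); // direct call: no state check in between
            return 0;
        }
    }

    public static void main(String[] args) {
        WriteFailureSketch processor = new WriteFailureSketch();
        Session session = new Session();
        session.state = State.CLOSING; // the problematic state from the report
        int written = processor.writeBuffer(session, new byte[] {1, 2, 3},
                (s, d) -> { throw new IOException("connection reset"); });
        System.out.println("written=" + written
                + ", close future done? " + session.closeFuture.isDone());
    }
}
```

In this toy version the future completes even when the session was already {{CLOSING}} before the write failed, which is the corner case the change is meant to cover.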
> MINA-CORE does not remove sessions if exceptions occur while closing
> --------------------------------------------------------------------
>
> Key: DIRMINA-1021
> URL: https://issues.apache.org/jira/browse/DIRMINA-1021
> Project: MINA
> Issue Type: Bug
> Affects Versions: 2.0.8
> Environment: mina-sshd 0.14.0
> mina-core 2.0.8
> Multiple OSes / Java configurations:
> * Mac OS X El Capitan on Java 8 (1.8.0_60)
> * CentOS 6.4 on Java 8 (1.8.0_60)
> * CentOS 6.5 on Java 8 (1.8.0_20-b26).
> Reporter: Doug Kelly
> Attachments: attempt-removing-sessions-closing.patch,
> mina-core-2.0.14-SNAPSHOT.jar
>
>
> MINA SSHD isn't removing sessions when using the MINA/NIO backend if an
> exception is received as the session is closing (such as when a connection reset
> is received with data still in the write buffer). When this case happens, it
> seems that {{NioProcessor.getState}} returns the state as {{CLOSING}} (I'm
> assuming since the underlying channel is now closed), which means that the
> {{AbstractPollingIoProcessor.removeSessions()}} won't ever prune the session,
> since a {{CLOSING}} state is simply ignored. The result is a resource leak
> over time, since these sessions are never pruned (it's a slow leak, since
> entering this condition is racy – on my workstation, I can produce it through
> randomly interrupting connections anywhere from 1/6 to 1/10th of the time).
> (This may either be major or critical; reprioritize as necessary.)
> I specifically see this error with Gerrit 2.10.4 and Gerrit 2.11.5 (using
> mina-sshd 0.14.0 / mina-core 2.0.8), and it looks like the code path is
> unchanged in mina-sshd 1.0.0 / mina-core 2.0.9. I was unsure if this is
> specifically a bug in mina-core or if it's something unique to mina-sshd. My
> local development system runs Mac OS X El Capitan on Java 8 (1.8.0_60), but
> I've also seen this on Linux (CentOS 6.4, again Java 1.8.0_60 and CentOS 6.5
> on Java 1.8.0_20-b26).
> The fix may be as simple as attempting to remove the session if {{OPENED}} or
> {{CLOSING}}, but I'm unsure what side-effects this may have with other
> backends. I'll be happy to test it locally, but I'm fairly ignorant when it
> comes to MINA's code.
> The attached patch (to mina-core) seems to resolve the issue by following the
> reproduction case I have on the [Gerrit issue
> tracker|https://code.google.com/p/gerrit/issues/detail?id=3685].