[
https://issues.apache.org/jira/browse/DIRMINA-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389586#comment-15389586
]
Carl Wicklow edited comment on DIRMINA-1021 at 7/22/16 2:25 PM:
----------------------------------------------------------------
Thanks Emmanuel.
I wasn't clear when I said "no exception is thrown". I meant that only in the
context of what happens in setInterestedInWrite().
In terms of the writeBuffer() call, an exception certainly is caught ....
{code:title=AbstractPollingIoProcessor.writeBuffer()|borderStyle=solid}
try {
localWrittenBytes = write(session, buf, length);
} catch (IOException ioe) {
// IOException("Broken Pipe") is caught ...
buf.free();
session.close(true);
destroy(session);
return 0;
}
{code}
The session.close(true) passes up the filter chain (as you showed in an earlier
post), and the session gets scheduled for removal ...
{code:title=AbstractPollingIoProcessor.remove()|borderStyle=solid}
public final void remove(S session) {
scheduleRemove(session);
startupProcessor();
}
private void scheduleRemove(S session) {
removingSessions.add(session);
}
{code}
The destroy(session) called from writeBuffer() then cancels the selection key
for this session (session state is now CLOSING).
{code:title=NioProcessor|borderStyle=solid}
@Override
protected void destroy(NioSession session) throws Exception {
ByteChannel ch = session.getChannel();
SelectionKey key = session.getSelectionKey();
if (key != null) {
key.cancel();
}
ch.close();
}
{code}
The writeBuffer() returns 0, the flushNow returns false from here ...
{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
if (localWrittenBytes == 0) {
setInterestedInWrite(session, true);
return false;
}
{code}
Which pops us back into the processor run loop ...
{code:title=AbstractPollingIoProcessor.Processor.run()|borderStyle=solid}
// Write the pending requests
long currentTime = System.currentTimeMillis();
flush(currentTime);
// And manage removed sessions
nSessions -= removeSessions();
{code}
... ready to remove the sessions. Our session is CLOSING now, though, so the
removal is not propagated ...
{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
case CLOSING:
// Skip if channel is already closed
// In any case, remove the session from the queue
removedSessions++;
break;
{code}
... and that means the CloseFuture never completes, nor does the WriteFuture
for the write that threw the exception, and no exception has been propogated to
the IoHandler.
Am still reviewing the rest (there's a lot to go through !), but I think this
may be a path thru the code where the IoHandler doesn't get told about the
exception.
was (Author: cwicklow):
Thanks Emmanuel.
I wasn't clear when I said "no exception is thrown". I meant that only in the
context of what happens in setInterestedInWrite().
In terms of the writeBuffer() call, an exception certainly is caught ....
{code:title=AbstractPollingIoProcessor.writeBuffer()|borderStyle=solid}
try {
localWrittenBytes = write(session, buf, length);
} catch (IOException ioe) {
// IOException("Broken Pipe") is caught ...
buf.free();
session.close(true);
destroy(session);
return 0;
}
{code}
The session.close(true) passes up the filter chain (as you showed in an earlier
post), and the session gets scheduled for removal ...
{code:title=AbstractPollingIoProcessor.remove()|borderStyle=solid}
public final void remove(S session) {
scheduleRemove(session);
startupProcessor();
}
private void scheduleRemove(S session) {
removingSessions.add(session);
}
{code}
The destroy(session) called from writeBuffer() then cancels the selection key
for this session (session state is now CLOSING).
{code:title=NioProcessor|borderStyle=solid}
@Override
protected void destroy(NioSession session) throws Exception {
ByteChannel ch = session.getChannel();
SelectionKey key = session.getSelectionKey();
if (key != null) {
key.cancel();
}
ch.close();
}
The writeBuffer() returns 0, the flushNow returns false from here ...
{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
if (localWrittenBytes == 0) {
setInterestedInWrite(session, true);
return false;
}
{code}
Which pops us back into the processor run loop ...
{code:title=AbstractPollingIoProcessor.Processor.run()|borderStyle=solid}
// Write the pending requests
long currentTime = System.currentTimeMillis();
flush(currentTime);
// And manage removed sessions
nSessions -= removeSessions();
{code}
... ready to remove the sessions. Our session is CLOSING now, though, so the
removal is not propagated ...
{code:title=AbstractPollingIoProcessor.flushNow()|borderStyle=solid}
case CLOSING:
// Skip if channel is already closed
// In any case, remove the session from the queue
removedSessions++;
break;
{code}
... and that means the CloseFuture never completes, nor does the WriteFuture
for the write that threw the exception, and no exception has been propogated to
the IoHandler.
Am still reviewing the rest (there's a lot to go through !), but I think this
may be a path thru the code where the IoHandler doesn't get told about the
exception.
> MINA-CORE does not remove sessions if exceptions occur while closing
> --------------------------------------------------------------------
>
> Key: DIRMINA-1021
> URL: https://issues.apache.org/jira/browse/DIRMINA-1021
> Project: MINA
> Issue Type: Bug
> Affects Versions: 2.0.8
> Environment: mina-ssh 0.14.0
> mina-core 2.0.8
> Multiple OSes / Java configurations:
> * Mac OS X El Capitan on Java 8 (1.8.0_60)
> * CentOS 6.4 on Java 8 (1.8.0_60)
> * CentOS 6.5 on Java 8 (1.8.0_20-b26).
> Reporter: Doug Kelly
> Attachments: attempt-removing-sessions-closing.patch
>
>
> MINA SSHD isn't removing sessions when using the MINA/NIO backend if an
> exception as received as the session is closing (such as a connection reset
> is received with data still in the write buffer). When this case happens, it
> seems that {{NioProcessor.getState}} returns the state as {{CLOSING}} (I'm
> assuming since the underlying channel is now closed), which means that the
> {{AbstractPollingIoProcessor.removeSessions()}} won't ever prune the session,
> since a {{CLOSING}} state is simply ignored. The result is a resource leak
> over time, since these sessions are never pruned (it's a slow leak, since
> entering this condition is racy – on my workstation, I can produce it through
> randomly interrupting connections anywhere from 1/6 to 1/10th of the time).
> (This may either be major or critical; reprioritize as necessary.)
> I specifically see this error with Gerrit 2.10.4 and Gerrit 2.11.5 (using
> mina-sshd 0.14.0 / mina-core 2.0.8), and it looks like the code path is
> unchanged in mina-sshd 1.0.0 / mina-core 2.0.9. I was unsure if this is
> specifically a bug in mina-core or, if it's something unique to mina-sshd. My
> local development system runs Mac OS X El Capitan on Java 8 (1.8.0_60), but
> I've also seen this on Linux (CentOS 6.4, again Java 1.8.0_60 and CentOS 6.5
> on Java 1.8.0_20-b26).
> The fix may be as simple as attempting to remove the session if {{OPENED}} or
> {{CLOSING}}, but I'm unsure what side-effects this may have with other
> backends. I'll be happy to test it locally, but I'm fairly ignorant when it
> comes to MINA's code.
> The attached patch (to mina-core) seems to resolve the issue by following the
> reproduction case I have on the [Gerrit issue
> tracker|https://code.google.com/p/gerrit/issues/detail?id=3685].
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)