[ https://issues.apache.org/jira/browse/SSHD-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16669526#comment-16669526 ]

jpalacios commented on SSHD-854:
--------------------------------

Hi [~lgoldstein],
Thanks for getting back to me.

{quote}
In any case, even if you don't want or cannot upgrade to 2.x (and there are no 
guarantees that it would fix this issue), the fix I introduced in the ongoing 
code as a result for this issue can easily be duplicated and used in 1.7.0 
(though we cannot release such a patch, it should be relatively easy to 
introduce it as a local workaround by you).
{quote}

Upgrading my local environment is not really an issue. Upgrading our customer's 
environment is a different story.

I think I managed to partially reproduce this issue by inserting a breakpoint 
in {{CachingPublicKeyAuthenticator}}.

Steps to reproduce:
* Insert a breakpoint in [line 53|https://github.com/apache/mina-sshd/blob/sshd-1.7.0/sshd-core/src/main/java/org/apache/sshd/server/auth/pubkey/CachingPublicKeyAuthenticator.java#L53] of {{CachingPublicKeyAuthenticator}}
* Start a session and let the thread stop at the breakpoint
* Start a second session, and let it run through to normal completion (without 
ever letting the first thread move). It may also be necessary to let the first 
thread sit for about 30 to 60 seconds.
* Go back to the initial thread and step into {{#addSessionListener}}. The 
session's {{state}} is now {{Immediate}} (I'm not sure what changed it though), 
so *the listener is not registered* and the cleanup logic in 
{{CachingPublicKeyAuthenticator}} is never called (see the sketch after this 
list)
* Let the thread complete its processing
* Start a third session and let it stop at the breakpoint. Observe that the 
first session is still in the cache even though it's closed; it has been leaked.
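
For reference, here is roughly the local workaround I have in mind for 1.7.0. 
It's only a sketch (the class name and structure are illustrative, not the 
actual 1.7.0 code) and it only assumes the standard {{PublickeyAuthenticator}}, 
{{SessionListener}} and {{Closeable#isClosing()}}/{{#isClosed()}} APIs. The idea 
is to evict the cache entry eagerly when the session turns out to already be 
closing, since {{sessionClosed}} will never be delivered to a listener that was 
not registered:

{code}
import java.security.PublicKey;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.sshd.common.session.Session;
import org.apache.sshd.common.session.SessionListener;
import org.apache.sshd.server.auth.pubkey.PublickeyAuthenticator;
import org.apache.sshd.server.session.ServerSession;

// Illustrative variant of the caching authenticator: after registering the
// session listener it re-checks the session state, so a session that is
// already closing (and will therefore never deliver sessionClosed) does not
// stay in the cache map forever.
public class EagerEvictionCachingAuthenticator implements PublickeyAuthenticator, SessionListener {
    private final PublickeyAuthenticator delegate;
    private final Map<ServerSession, Map<PublicKey, Boolean>> cache = new ConcurrentHashMap<>();

    public EagerEvictionCachingAuthenticator(PublickeyAuthenticator delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean authenticate(String username, PublicKey key, ServerSession session) {
        Map<PublicKey, Boolean> map = cache.get(session);
        if (map == null) {
            map = new ConcurrentHashMap<>();
            cache.put(session, map);
            session.addSessionListener(this);
            // If the session already moved to a closing/closed state, the
            // registration above may have been a no-op and sessionClosed(...)
            // will never fire for us - drop the entry right away.
            if (session.isClosing() || session.isClosed()) {
                cache.remove(session);
                session.removeSessionListener(this);
            }
        }
        Boolean result = map.get(key);
        if (result == null) {
            result = Boolean.valueOf(delegate.authenticate(username, key, session));
            map.put(key, result);
        }
        return result.booleanValue();
    }

    @Override
    public void sessionClosed(Session session) {
        cache.remove(session);
        session.removeSessionListener(this);
    }
}
{code}

The extra check is of course best-effort, but it covers exactly the window 
described in the steps above.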

I say I only _partially_ reproduced it because I'm not seeing the nested graph, 
only the leak in {{CachingPublicKeyAuthenticator}}. However, on my local 
environment a different {{Selector}} implementation is being used. I need to 
run a similar test on a Linux environment to see whether the nesting happens 
when {{EPollSelectorImpl}} is used.

It's worth noting that the changes you've made to use session attributes 
instead of a map to store/retrieve the cached values would solve the problem 
I'm describing. When do you think we can expect those changes to be released?
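
For what it's worth, this is roughly how I picture that attribute-based 
approach; again just a sketch with made-up names, assuming only the 
{{Session.AttributeKey}} / {{getAttribute}} / {{setAttribute}} API that is 
already available in 1.7.0. Since the per-session map is stored on the session 
itself, it cannot outlive the session and no listener registration is needed at 
all:

{code}
import java.security.PublicKey;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.sshd.common.session.Session;
import org.apache.sshd.server.auth.pubkey.PublickeyAuthenticator;
import org.apache.sshd.server.session.ServerSession;

// Illustrative attribute-based cache: the per-session result map is stored
// on the session itself, so there is no external map that can outlive the
// session and no listener registration that can silently fail.
public class AttributeCachingAuthenticator implements PublickeyAuthenticator {
    private static final Session.AttributeKey<Map<PublicKey, Boolean>> CACHE_KEY =
            new Session.AttributeKey<>();

    private final PublickeyAuthenticator delegate;

    public AttributeCachingAuthenticator(PublickeyAuthenticator delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean authenticate(String username, PublicKey key, ServerSession session) {
        Map<PublicKey, Boolean> map = session.getAttribute(CACHE_KEY);
        if (map == null) {
            map = new ConcurrentHashMap<>();
            session.setAttribute(CACHE_KEY, map);
        }
        Boolean result = map.get(key);
        if (result == null) {
            result = Boolean.valueOf(delegate.authenticate(username, key, session));
            map.put(key, result);
        }
        return result.booleanValue();
    }
}
{code}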

Cheers
Juan Palacios

> Massive object graph in NioSocketSession
> ----------------------------------------
>
>                 Key: SSHD-854
>                 URL: https://issues.apache.org/jira/browse/SSHD-854
>             Project: MINA SSHD
>          Issue Type: Bug
>            Reporter: jpalacios
>            Priority: Major
>
> I'm looking at a heap dump from one of our customers where the retained heap 
> size for some {{NioSocketSession}} instances is almost 1GB.
> From the looks of the dump MINA has created a massive object graph where:
> {code}
> NioSocketSession -> SelectionKeyImpl -> EPollSelectorImpl -> HashMap -> 
> SelectionKeyImpl -> NioSocketSession -> ...
> {code}
> From the looks of the object IDs these are not loops.
> Each individual object is not large by itself, but at the top of the graph the 
> accumulated retained size is enough to produce an OOME.
> Could you help me understand how MINA can produce such a massive object 
> graph? Should MINA apply any defense mechanism to prevent this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
