[ 
https://issues.apache.org/jira/browse/DIRMINA-1096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16664532#comment-16664532
 ] 

jpalacios commented on DIRMINA-1096:
------------------------------------

Hi [~elecharny],
Thank you for getting back to me.

This was not a load test but a live production environment. I am currently 
working on understanding what kind of load the system was under when this 
incident happened. I suspect there was a load spike, but I can't say for sure 
at the moment as I'm still collecting data from the environment (I do not have 
direct access to it, so it's taking a bit of time).

I don't believe there were any pending writes. The {{writeRequestQueue}} is not 
where the memory is being retained.

What I'm seeing though is that there are roughly 94K instances of 
{{NioSocketSession}}. Looking at their referenced {{SocketChannelImpl}} I see 
that 43 are open and 93983 are closed.

As I said before (and I may not have been clear, so I apologise) all 94K seem 
to have been nested as explained in the ticket's description. This causes the 
resulting object graph to have a massive retained size, but no individual 
session has a large shallow size.
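
To make the shallow vs. retained distinction concrete, here is a minimal sketch. It is not MINA code: {{Node}}, its 64-byte payload and the chain length are assumptions chosen only to mirror the roughly 94K sessions above. Each node is tiny on its own, but the head of the chain keeps every other node reachable.

```java
public class RetainedChainSketch {

    /** Hypothetical stand-in for a session: small on its own, but it references the next one. */
    static final class Node {
        final byte[] payload = new byte[64]; // assumed per-object payload, purely illustrative
        Node next;
    }

    /** Builds a non-cyclic chain of the given length and returns its head. */
    static Node buildChain(int length) {
        Node head = null;
        for (int i = 0; i < length; i++) {
            Node n = new Node();
            n.next = head;
            head = n;
        }
        return head;
    }

    /** Counts the nodes kept reachable by the head (iterative, to avoid deep recursion). */
    static long reachableFrom(Node head) {
        long count = 0;
        for (Node n = head; n != null; n = n.next) {
            count++;
        }
        return count;
    }

    /** Sums only the payload bytes kept alive by the head; a lower bound on its retained size. */
    static long retainedPayloadBytes(Node head) {
        long bytes = 0;
        for (Node n = head; n != null; n = n.next) {
            bytes += n.payload.length;
        }
        return bytes;
    }

    public static void main(String[] args) {
        Node head = buildChain(94_000);
        System.out.println("nodes reachable from head: " + reachableFrom(head));
        System.out.println("payload bytes retained by head: " + retainedPayloadBytes(head));
    }
}
```

As long as the head is referenced from a GC root, none of the nodes behind it can be collected, which matches what the profiler reports for the top-level session.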

Looking at the graph in JProfiler, the GC root appears to be a thread whose 
stack references:
{code:java}
Thread -> ServerSessionImpl -> SshServer -> JohnsonAwarePublicKeyAuthenticator 
-> CachingPublicKeyAuthenticator -> ConcurrentHashMap -> 
ConcurrentHashMap$Node[] -> ConcurrentHashMap$Node -> ServerSessionImpl -> 
MinaSession -> NioSocketSession -> ...
{code}
Is it possible that in some circumstances {{CachingPublicKeyAuthenticator}} may 
not clear its cache correctly and as a result closed sessions are left around 
which start nesting within each other until the heap blows?
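
To illustrate the failure mode I have in mind, here is a minimal sketch of a cache keyed by session. It is not the actual {{CachingPublicKeyAuthenticator}} code: {{FakeSession}}, {{SessionKeyedCache}} and {{onSessionClosed}} are hypothetical names, and I am only assuming the real cache has a similar shape because of the {{ConcurrentHashMap}} in the reference chain above. If the close notification is ever missed, the closed session stays strongly reachable from the map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SessionCacheSketch {

    /** Hypothetical stand-in for a server session; not the real SSHD class. */
    static final class FakeSession {
        private volatile boolean open = true;
        boolean isOpen() { return open; }
        void close() { open = false; }
    }

    /** Hypothetical cache keyed by session, mimicking the shape seen in the heap dump. */
    static final class SessionKeyedCache {
        private final Map<FakeSession, Map<String, Boolean>> cache = new ConcurrentHashMap<>();

        /** Caches a (fake) authentication result per session and key. */
        boolean authenticate(FakeSession session, String key) {
            return cache.computeIfAbsent(session, s -> new ConcurrentHashMap<>())
                        .computeIfAbsent(key, k -> Boolean.TRUE);
        }

        /** Must be called when a session closes; otherwise its entry is never evicted. */
        void onSessionClosed(FakeSession session) {
            cache.remove(session);
        }

        /** Number of cached sessions that are already closed, i.e. leaked entries. */
        long closedEntries() {
            return cache.keySet().stream().filter(s -> !s.isOpen()).count();
        }
    }

    /** Opens, authenticates and closes the given number of sessions; returns the leaked count. */
    static long leakedAfter(int sessions, boolean notifyOnClose) {
        SessionKeyedCache cache = new SessionKeyedCache();
        for (int i = 0; i < sessions; i++) {
            FakeSession s = new FakeSession();
            cache.authenticate(s, "key-" + i);
            s.close();
            if (notifyOnClose) {
                cache.onSessionClosed(s);
            }
        }
        return cache.closedEntries();
    }

    public static void main(String[] args) {
        System.out.println("closed sessions retained without eviction: " + leakedAfter(1000, false));
        System.out.println("closed sessions retained with eviction: " + leakedAfter(1000, true));
    }
}
```

In this sketch every closed session stays in the map when the close callback is skipped, and none do when it runs. Whether the real authenticator can miss that callback is exactly what I'm unsure about.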

Cheers

Juan Palacios

> Massive object graph in NioSocketSession
> ----------------------------------------
>
>                 Key: DIRMINA-1096
>                 URL: https://issues.apache.org/jira/browse/DIRMINA-1096
>             Project: MINA
>          Issue Type: Bug
>    Affects Versions: 2.0.16
>            Reporter: jpalacios
>            Priority: Major
>
> I'm looking at a heap dump from one of our customers where the retained heap 
> size for some {{NioSocketSession}} instances is almost 1GB.
> From the looks of the dump MINA has created a massive object graph where:
> {code}
> NioSocketSession -> SelectionKeyImpl -> EpollSelectorImpl -> HashMap -> 
> SelectionKeyImpl -> NioSocketSession -> ...
> {code}
> From the looks of the object IDs these are not loops.
> Each individual object is not large by itself, but at the top of the graph the 
> accumulated retained size is enough to produce an OOME.
> Could you help me understand how MINA can produce such a massive object 
> graph? Should MINA apply any defense mechanism to prevent this?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
