[ 
https://issues.apache.org/jira/browse/SSHD-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726877#comment-13726877
 ] 

Guillaume Nodet commented on SSHD-249:
--------------------------------------

Nice catch.
I suppose using an ArrayList instead of an Array would work better since the 
ConcurrentHashMap#values()#toArray methods may always suffer from that problem.
I don't think we can iterator over the values() collection directly, because 
the same kind of problem may happen with the latch count.

                
> Data race in AbstractSession.close() may lead to NPE and blocks during 
> shutdown
> -------------------------------------------------------------------------------
>
>                 Key: SSHD-249
>                 URL: https://issues.apache.org/jira/browse/SSHD-249
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Marian Seitner
>            Priority: Critical
>
> The issue will be described in more details because a) it was enthralling to 
> debug and reproduce b) it might be interesting for others to fully understand 
> c) the probability to discover it is quite low, nonetheless it happened 
> during an aggressive integration test.
> The symptoms were a NullPointerException in AbstractSession.close() line 335 
> and a blocking thread which was supposed to shut down a server instance. 
> After looking at the code line 331 was identified as critical section as it 
> was the only place where a null value could have sneaked into the array.
> In order to run into this problem two preconditions have to be met: a) the 
> server is currently shutting down and thus trying to close all channels and 
> sessions b) one or more channels are unregistered at the same time.
> What happens in line 331:
> // Channel[] channelToClose = channels.values().toArray(new 
> Channel[channels.values().size()]);
> In case of unlucky timing the (first) call to channels.values() returns a 
> larger set (whose size is used to initialize the array) than the second call, 
> which is used to actually populate the array, so one or more array elements 
> are null.
> This single problem illustrates three issues:
> * Atomic treatment of non-atomic operations
> * Unguarded access to the channels map through (un)registerChannel() etc.
> * Incorrect exception handling (line 340 never gets called due to missing 
> latch.decrementAndGet() calls, so the IoSessionCloser is never executed 
> (which then leads to the blocking thread))
> If somehow manageable a patch will be provided.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to