Keith Turner created ACCUMULO-4809:
--------------------------------------

             Summary: Session manager clean up can happen when lock held.
                 Key: ACCUMULO-4809
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4809
             Project: Accumulo
          Issue Type: Bug
    Affects Versions: 1.8.1, 1.7.3
            Reporter: Keith Turner
             Fix For: 1.7.4, 1.9.0, 2.0.0


While working on [PR #382|https://github.com/apache/accumulo/pull/382] for 
ACCUMULO-4782 I noticed a significant concurrency bug.  Before #382 there was a 
single lock for the session manager. The session manager cleans up idle 
sessions.  This cleanup should happen outside the session manager lock, 
because all tserver read/write operations use the session manager, so it must 
stay responsive.

The bug is the following.
 * Both getActiveScansPerTable() and getActiveScans() lock the session manager 
and then lock idleSessions.  See [SessionManager line 
233|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L233]
 
 * The sweep() method locks idleSessions and does cleanup while this lock is 
held. See [SessionManager line 
200|https://github.com/apache/accumulo/blob/rel/1.7.3/server/tserver/src/main/java/org/apache/accumulo/tserver/session/SessionManager.java#L200]
 

Therefore it is possible for getActiveScansPerTable() or getActiveScans() to 
lock the session manager and then block trying to lock idleSessions while 
cleanup is happening in sweep().  This will block all access to the session 
manager while cleanup happens.
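
The lock nesting described above, and one possible shape for a targeted fix, 
can be sketched roughly as follows.  This is only an illustration and not the 
actual SessionManager code: the class SessionManagerSketch, the Session 
interface and the methods addIdle(), sweepHoldingLock(), sweepOutsideLock() and 
countIdle() are all hypothetical names.  It also assumes that sessions being 
cleaned up no longer need to be visible to readers, which may not hold for the 
real getActiveScans().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tserver session manager; not the real class.
class SessionManagerSketch {

  // Hypothetical session type; cleanup() may be slow.
  interface Session {
    void cleanup();
  }

  private final List<Session> idleSessions = new ArrayList<>();

  void addIdle(Session s) {
    synchronized (idleSessions) {
      idleSessions.add(s);
    }
  }

  // Problematic shape: slow cleanup runs while idleSessions is locked.
  void sweepHoldingLock() {
    synchronized (idleSessions) {
      for (Session s : idleSessions) {
        s.cleanup();
      }
      idleSessions.clear();
    }
  }

  // Possible targeted shape: hold the lock only long enough to take a copy,
  // then run the slow cleanup with no lock held.
  void sweepOutsideLock() {
    List<Session> toClean;
    synchronized (idleSessions) {
      toClean = new ArrayList<>(idleSessions);
      idleSessions.clear();
    }
    for (Session s : toClean) {
      s.cleanup();
    }
  }

  // Mirrors the nesting in the report: lock the manager, then idleSessions.
  // If a sweep holds idleSessions for a long time, this caller blocks while
  // still holding the manager lock, which stalls every other caller.
  synchronized int countIdle() {
    synchronized (idleSessions) {
      return idleSessions.size();
    }
  }
}
```

With sweepOutsideLock(), a caller of countIdle() waits only for the short copy 
step rather than for the whole cleanup.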

The changes in #382 will fix this for 1.9.0 and 2.0.0.  However, I am not sure 
about backporting #382 to 1.7.  Either a more targeted fix could be made for 
1.7, or #382 could be backported.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
