Jonathan Fortier created AMQCPP-506:
---------------------------------------

             Summary: Exception "attempt to unlock read lock, not locked by 
current thread" when doing performance testing
                 Key: AMQCPP-506
                 URL: https://issues.apache.org/jira/browse/AMQCPP-506
             Project: ActiveMQ C++ Client
          Issue Type: Bug
          Components: Decaf
    Affects Versions: 3.7.1
         Environment: Windows 7
            Reporter: Jonathan Fortier
            Assignee: Timothy Bish


While doing long-term performance testing of our application (~10,000 
messages/second), an exception is thrown after a few hours of operation. Here 
are the details of the exception:
IllegalMonitorStateException: attempt to unlock read lock, not locked by 
current thread
Stack trace:
{quote}
`anonymous namespace'::Sync::tryReleaseShared(int unused=1)  Line 205
decaf::util::concurrent::locks::AbstractQueuedSynchronizer::releaseShared(int 
arg=1)  Line 1630 + 0x11 bytes
`anonymous namespace'::ReadLock::unlock()  Line 660
activemq::core::kernels::ActiveMQSessionKernel::lookupConsumerKernel(...)  Line 
1336
activemq::core::ActiveMQSessionExecutor::dispatch(...)  Line 151 + 0x47 bytes
activemq::core::ActiveMQSessionExecutor::iterate()  Line 182
activemq::threads::DedicatedTaskRunner::run()  Line 141 + 0x13 bytes
decaf::lang::Thread::run()  Line 143
{quote}

After a little debugging, I identified a code defect that seems to be the cause 
of our problem. In class 
decaf::util::concurrent::locks::ReentrantReadWriteLock, the class member 
"cachedHoldCounter" is used to optimize performance. However, that member is 
accessed concurrently by multiple threads, and modifications to it are not 
atomic, so a thread can read a partly updated member (i.e. the count of 
thread #2 with the pointer to thread #1). In that case, the lock logic gets 
confused, and we end up with strange behavior (e.g. waiting forever for the 
lock).

I wrote a unit test to reproduce the problem (see attachment). However, since 
this is a race condition, it may take a few runs to reproduce.

When I commented out the cachedHoldCounter-related code in 
ReentrantReadWriteLock (i.e. always go through ThreadLocal), the problem 
appears to be gone.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
