Small window in wakeup logic for PooledTaskRunner - task can get executed in 
parallel
--------------------------------------------------------------------------------------

                 Key: AMQ-1686
                 URL: https://issues.apache.org/activemq/browse/AMQ-1686
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 5.0.0
         Environment: Windows XP
            Reporter: Gary Tully


org.apache.activemq.broker.region.cursors.CursorDurableTest sometimes fails on 
Windows with the error:

Exception in thread "Persistence Adaptor Task" java.lang.NullPointerException
        at org.apache.activemq.store.amq.AMQMessageStore$4.execute(AMQMessageStore.java:381)
        at org.apache.activemq.util.TransactionTemplate.run(TransactionTemplate.java:44)
        at org.apache.activemq.store.amq.AMQMessageStore.doAsyncWrite(AMQMessageStore.java:374)
        at org.apache.activemq.store.amq.AMQMessageStore.asyncWrite(AMQMessageStore.java:341)
        at org.apache.activemq.store.amq.AMQMessageStore$1.iterate(AMQMessageStore.java:95)
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:122)
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:43)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 303.672 sec <<< FAILURE!

The problem appears to be in the interaction between wakeup and runTask in 
PooledTaskRunner. iterating is set to false in a finally block, and queued is 
checked in a separate sync block. If wakeup is called in the window between 
the two, it can set queued and find iterating false, so it will execute the 
task; runTask will then find queued true and execute it too.
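
To make the window concrete, here is a minimal single-threaded sketch that 
replays the interleaving step by step. It is a simplified model, not the 
PooledTaskRunner source: the class RacyRunnerModel, its method names, and the 
submissions counter (standing in for handing the task to the executor) are all 
assumptions made for illustration.

```java
// Simplified model of the runner's flags; NOT the ActiveMQ source.
class RacyRunnerModel {
    private final Object lock = new Object();
    private boolean queued;
    private boolean iterating;
    private int submissions; // times the task was handed to the executor

    // Models wakeup(): set queued, and submit unless an iteration is in progress.
    void wakeup() {
        synchronized (lock) {
            if (queued) {
                return;
            }
            queued = true;
            if (!iterating) {
                submissions++;
            }
        }
    }

    // Models the start of runTask().
    void beginRun() {
        synchronized (lock) {
            queued = false;
            iterating = true;
        }
    }

    // Models the finally block: only iterating is cleared here.
    void clearIteratingInFinally() {
        synchronized (lock) {
            iterating = false;
        }
    }

    // Models the separate sync block that resubmits when queued is set.
    void resubmitIfQueued() {
        synchronized (lock) {
            if (queued) {
                submissions++;
            }
        }
    }

    // Replays the interleaving and returns how many submissions one wakeup caused.
    static int replayWindow() {
        RacyRunnerModel r = new RacyRunnerModel();
        r.beginRun();                // a run is already in progress
        r.clearIteratingInFinally(); // runTask: finally block
        r.wakeup();                  // another thread, inside the window
        r.resubmitIfQueued();        // runTask: separate sync block
        return r.submissions;
    }

    public static void main(String[] args) {
        System.out.println("submissions from one wakeup: " + replayWindow());
    }
}
```

In this model a single wakeup produces two submissions, which on a thread pool 
means two threads can be inside the task at once.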

The fix is to include the queued check in the finally block so that iterating 
and queued are checked at the same time. I will attach a patch with this fix.
I attempted to reproduce this problem with a unit test but did not have any 
real success; the window is quite small. I will include the unit test in case 
it can be improved upon.
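
As a rough illustration of the idea behind the fix (not the attached patch 
itself), the sketch below clears iterating and checks queued under a single 
lock hold. The class FixedRunnerModel, its method names, and the submissions 
counter are hypothetical; the counter stands in for handing the task to the 
executor.

```java
// Simplified model of the fixed logic; NOT the attached patch or ActiveMQ source.
class FixedRunnerModel {
    private final Object lock = new Object();
    private boolean queued;
    private boolean iterating;
    private int submissions; // times the task was handed to the executor

    // Models wakeup(): set queued, and submit unless an iteration is in progress.
    void wakeup() {
        synchronized (lock) {
            if (queued) {
                return;
            }
            queued = true;
            if (!iterating) {
                submissions++;
            }
        }
    }

    // Models the start of runTask().
    void beginRun() {
        synchronized (lock) {
            queued = false;
            iterating = true;
        }
    }

    // Models the fixed finally block: both flags are handled in one sync block,
    // so wakeup() can no longer slip in between the two steps.
    void finishRunInFinally() {
        synchronized (lock) {
            iterating = false;
            if (queued) {
                submissions++;
            }
        }
    }

    // wakeup() arrives while the task is still iterating.
    static int wakeupBeforeFinally() {
        FixedRunnerModel r = new FixedRunnerModel();
        r.beginRun();
        r.wakeup();             // only marks queued; iterating is still true
        r.finishRunInFinally(); // the runner resubmits once
        return r.submissions;
    }

    // wakeup() arrives after the finally block has completed.
    static int wakeupAfterFinally() {
        FixedRunnerModel r = new FixedRunnerModel();
        r.beginRun();
        r.finishRunInFinally(); // nothing queued, no resubmission
        r.wakeup();             // wakeup submits once
        return r.submissions;
    }

    public static void main(String[] args) {
        System.out.println("before finally: " + wakeupBeforeFinally());
        System.out.println("after finally:  " + wakeupAfterFinally());
    }
}
```

Because the two steps are atomic with respect to wakeup, the only possible 
orderings are "before" and "after", and each yields exactly one submission in 
this model.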

chirino merged a fix yesterday that addresses the symptom of this issue in a 
different way:
http://svn.apache.org/viewvc?view=rev&revision=650956

The added synchronisation means that parallel calls to asyncWrite from the 
PooledTaskRunner are serialised on the method access.
This fix addresses the root cause and can remove the need for that 
synchronisation.

FYI: in the test, the parallel calls can come from flush() and from the 
asyncWrite task.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
