[
https://issues.apache.org/activemq/browse/AMQ-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Richard Yarger updated AMQ-1918:
--------------------------------
Attachment: NegativeQueueCursorSupport.java
I have created a unit test that can reproduce the issue.
It takes around 5 minutes to complete.
I modeled the test on the CursorSupport test case,
adding a second queue and more specific memory settings.
I also included tests with different prefetch values.
Lowering prefetch seems to have a direct impact on the issue.
testWithDefaultPrefetch() and testWithDefaultPrefetchFiveConsumers()
are usually the ones to fail.
I am reproducing the issue quite easily with this test case,
so let me know if you cannot.
Thanks.
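
For anyone who wants the general shape of the test without opening the
attachment, here is a rough sketch. It is not the attached
NegativeQueueCursorSupport.java; the queue name, message count, prefetch
value and memory limit below are illustrative assumptions only.

import javax.jms.Connection;
import javax.jms.MessageConsumer;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

import junit.framework.TestCase;

import org.apache.activemq.ActiveMQConnectionFactory;
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.command.ActiveMQQueue;

public class NegativeQueueCursorSketchTest extends TestCase {

    private BrokerService broker;

    protected void setUp() throws Exception {
        broker = new BrokerService();
        broker.setPersistent(true);
        // Small memory limit so the cursor has to page messages through the
        // store (assumed value; the attached test uses its own settings).
        broker.getSystemUsage().getMemoryUsage().setLimit(1024 * 1024);
        broker.addConnector("tcp://localhost:61616");
        broker.start();
    }

    protected void tearDown() throws Exception {
        broker.stop();
    }

    public void testWithLowPrefetch() throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://localhost:61616");
        // Lowering prefetch seems to make the failure more likely.
        factory.getPrefetchPolicy().setQueuePrefetch(10);

        final Connection connection = factory.createConnection();
        connection.start();
        final Queue queue = new ActiveMQQueue("TEST.QUEUE.A");
        final int messageCount = 2000;

        // Produce from a separate thread so consumption overlaps with the
        // store filling up, similar to the CursorSupport test.
        Thread producerThread = new Thread(new Runnable() {
            public void run() {
                try {
                    Session producerSession =
                            connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                    MessageProducer producer = producerSession.createProducer(queue);
                    for (int i = 0; i < messageCount; i++) {
                        producer.send(producerSession.createTextMessage("message " + i));
                    }
                    producerSession.close();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
        producerThread.start();

        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageConsumer consumer = session.createConsumer(queue);
        int received = 0;
        while (consumer.receive(10000) != null) {
            received++;
        }
        producerThread.join();
        // If the cached cursor size has gone negative, the consumer stalls
        // before draining the queue and this assertion fails.
        assertEquals(messageCount, received);
        connection.close();
    }
}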
> AbstractStoreCursor.size gets out of synch with Store size and blocks
> consumers
> -------------------------------------------------------------------------------
>
> Key: AMQ-1918
> URL: https://issues.apache.org/activemq/browse/AMQ-1918
> Project: ActiveMQ
> Issue Type: Bug
> Components: Message Store
> Affects Versions: 5.1.0
> Reporter: Richard Yarger
> Assignee: Rob Davies
> Priority: Critical
> Fix For: 5.3.0
>
> Attachments: activemq.xml, NegativeQueueCursorSupport.java,
> testAMQMessageStore.zip, testdata.zip
>
>
> In version 5.1.0, we are seeing our queue consumers stop consuming for no
> reason.
> We have a staged queue environment and we occasionally see one queue display
> negative pending message counts that hang around -x, rise to -x+n gradually
> and then fall back to -x abruptly. The messages are building up and being
> processed in bunches, but it's not easy to see because the counts are negative.
> We see this behavior in the messages coming out of the system. Outbound
> messages come out in bunches and are synchronized with the queue pending
> count dropping to -x.
> This issue does not happen ALL of the time. It happens about once a week and
> the only way to fix it is to bounce the broker. It doesn't happen to the same
> queue every time, so it is not our consuming code.
> Although we don't have a reproducible scenario, we have been able to debug
> the issue in our test environment.
> We traced the problem to the cached store size in the AbstractStoreCursor.
> This value becomes 0 or negative and prevents the AbstractStoreCursor from
> retrieving more messages from the store (see AbstractStoreCursor.fillBatch()
> and the simplified sketch after this quoted description).
> We have seen the size value go lower than -1000.
> We have also forced it to fix itself by sending in n+1 messages. Once the
> size goes above zero, the cached value is refreshed and things work OK again.
> Unfortunately, during low-volume periods it could be hours before n+1 messages
> are received, so our message latency can rise at exactly those times... :(
> I have attached our broker config.
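
To make the quoted description easier to follow, here is a heavily
simplified, self-contained illustration of the failure mode: a cursor that
trusts a cached size counter and refuses to fetch from the store while that
counter is zero or negative. This is NOT the real AbstractStoreCursor code;
the class name, fields and batch size are invented for the illustration.

import java.util.LinkedList;

public class CachedSizeCursorSketch {

    // Cached count of messages believed to be in the store. In the real
    // cursor this is kept in step with store adds/removes; the report is
    // that it drifts negative under load.
    private int size;
    private final LinkedList<String> store = new LinkedList<String>();
    private final LinkedList<String> batch = new LinkedList<String>();

    public void addMessage(String message) {
        store.addLast(message);
        size++;   // if an add is ever missed (or a remove double-counted),
                  // size drifts away from store.size()
    }

    private void fillBatch() {
        // Guard modelled on the reported behaviour: nothing is recovered
        // from the store while the cached size is <= 0, even though the
        // store still holds messages.
        if (size <= 0) {
            return;
        }
        while (!store.isEmpty() && batch.size() < 10) {
            batch.addLast(store.removeFirst());
        }
    }

    public String next() {
        if (batch.isEmpty()) {
            fillBatch();
        }
        if (batch.isEmpty()) {
            return null;      // consumer sees an "empty" queue
        }
        size--;
        return batch.removeFirst();
    }

    public static void main(String[] args) {
        CachedSizeCursorSketch cursor = new CachedSizeCursorSketch();
        cursor.addMessage("m1");
        cursor.addMessage("m2");
        // Simulate the accounting bug: the cached size goes negative while
        // the store still contains messages.
        cursor.size = -3;
        System.out.println(cursor.next());   // prints null: consumer is starved
        // Sending n+1 more messages pushes the counter above zero again,
        // which matches the workaround described in the report.
        for (int i = 0; i < 4; i++) {
            cursor.addMessage("extra" + i);
        }
        System.out.println(cursor.next());   // messages flow again
    }
}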
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.