[ 
https://issues.apache.org/activemq/browse/AMQ-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_40074
 ] 

David Sitsky commented on AMQ-1251:
-----------------------------------

In case it is hard for you to reproduce, here are the relevant statistics 
obtained using JMX when the unit test hangs after the first batch of 1000 
messages has been processed:

For the work-items queue (which has two worker thread consumers):

ConsumerCount: 2
DequeueCount: 1000
DispatchCount: 1000
EnqueueCount: 1001

This makes sense: the enqueue count of 1001 includes the message the master 
has sent to the work-items queue to tell the workers to start processing the 
second batch of 1000 items, but for whatever reason, this message hasn't been 
dispatched to a worker.
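
To make the arithmetic explicit, here is a minimal sketch of the check I am 
doing by eye on the queue counters above. The class and method names are made 
up for illustration and are not part of the ActiveMQ API; the values are the 
ones copied from JMX.

```java
// Illustrative only: QueueCounters is a hypothetical holder for the four
// queue-level values read from JMX, not an ActiveMQ class.
final class QueueCounters {
    final long consumerCount;
    final long enqueueCount;
    final long dispatchCount;
    final long dequeueCount;

    QueueCounters(long consumerCount, long enqueueCount,
                  long dispatchCount, long dequeueCount) {
        this.consumerCount = consumerCount;
        this.enqueueCount = enqueueCount;
        this.dispatchCount = dispatchCount;
        this.dequeueCount = dequeueCount;
    }

    // Messages the queue has accepted but not yet handed to any consumer.
    long undispatched() {
        return enqueueCount - dispatchCount;
    }

    // Suspicious when consumers are attached yet something is left undispatched.
    // Only meaningful if the counters stay like this across repeated reads.
    boolean looksStuck() {
        return consumerCount > 0 && undispatched() > 0;
    }

    public static void main(String[] args) {
        QueueCounters workItems = new QueueCounters(2, 1001, 1000, 1000);
        // prints "undispatched=1 looksStuck=true"
        System.out.println("undispatched=" + workItems.undispatched()
                + " looksStuck=" + workItems.looksStuck());
    }
}
```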

For the two worker subscriptions on this queue, here are their stats:

Worker 1:

DequeueCounter: 998
DispatchedCounter: 998
DispatchedQueueSize: 0
EnqueueCounter: 1001
MaximumPendingMessageLimit: 0
PendingQueueSize: 3
PrefetchSize: 0

Worker 2:

DequeueCounter: 2
DispatchedCounter: 2
DispatchedQueueSize: 0
EnqueueCounter: 1001
MaximumPendingMessageLimit: 0
PendingQueueSize: 998
PrefetchSize: 0

I can also confirm that all 3 threads (two workers, one master) are waiting in 
receive(), by dumping the thread stacks:

at org.apache.activemq.MessageDispatchChannel.dequeue(MessageDispatchChannel.java:75)
- locked <0x199c50d8> (a java.lang.Object)
at org.apache.activemq.ActiveMQMessageConsumer.dequeue(ActiveMQMessageConsumer.java:405)
at org.apache.activemq.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:453)
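
If it helps with reproducing, the same check can be done from inside the test 
JVM rather than with an external thread dump. This is only a sketch: it assumes 
the consumers run in the same JVM as the code doing the check, and the helper 
class name is made up. Thread.getAllStackTraces() is the standard JDK call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative helper, not part of ActiveMQ or the attached unit tests.
final class BlockedThreadFinder {

    // Returns the names of live threads that currently have a frame for
    // className.methodName somewhere on their stack.
    static List<String> threadsIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(entry.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // In the hanging test this would be expected to list all three threads.
        System.out.println(threadsIn(
                "org.apache.activemq.ActiveMQMessageConsumer", "receive"));
    }
}
```

A snapshot like this only shows where the threads are at one instant, so it is 
worth taking it a few times before concluding they are all stuck.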

Looking at the numbers, it really looks like a new message has been put into 
the queue, but hasn't been dispatched.

Is there any more information you need apart from the above and the unit tests 
provided to squash this issue?




> Broker stops delivering messages to some consumers
> --------------------------------------------------
>
>                 Key: AMQ-1251
>                 URL: https://issues.apache.org/activemq/browse/AMQ-1251
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 4.1.0
>         Environment: WinXP
>            Reporter: Vadim Pesochinskiy
>            Assignee: Rob Davies
>             Fix For: 5.0.0
>
>         Attachments: TestActiveMQ.java, TestActiveMQSyncReceive.java
>
>
> I have around 40 consumers taking messages from a single queue. After a while 
> 1 or 2 consumers stop receiving any messages. Going to JMX and stopping the 
> corresponding connection causes a reconnect and messages are delivered again.
> I reproduced it twice in the QA environment and now it has happened in 
> production. I tried to instrument the code and set the log to debug, but that 
> changed the timing and I failed to reproduce it after the changes.
> I suspect that the runtime association between the Queue and Consumer objects 
> is lost on the Broker side. 
> One of the suspects is the empty catch block in the RoundRobinDispatchPolicy 
> class (line 64). It is possible that the CopyOnWrite array list is messed up 
> and it fails when the removed consumer is added back. 
> BTW, a CopyOnWrite list is good when you mostly read, but not so good when you 
> write for every message delivery, and empty catch blocks are bad in any case.
> if (firstMatchingConsumer != null) {
>     // Rotate the consumer list.
>     try {
>         consumers.remove(firstMatchingConsumer);
>         consumers.add(firstMatchingConsumer);
>     } catch (Throwable bestEffort) {
>     }
> }
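
For comparison with the snippet quoted above, here is a minimal sketch of the 
same rotation that at least reports a failure instead of swallowing it. It is 
not the actual RoundRobinDispatchPolicy code or the fix; the class and method 
names are hypothetical, and re-adding only when the remove succeeded is my own 
assumption about the intended behaviour.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative only: not the ActiveMQ implementation.
final class ConsumerRotation {
    private static final Logger LOG =
            Logger.getLogger(ConsumerRotation.class.getName());

    // Moves 'first' to the back of the list so the next dispatch starts with
    // a different consumer. The remove and the add are two separate list
    // operations, so a concurrent reader can briefly see the list without it.
    static <T> void rotateToBack(List<T> consumers, T first) {
        try {
            // Assumption: only re-add when the element was really removed,
            // so a consumer that has already gone away is not resurrected.
            if (consumers.remove(first)) {
                consumers.add(first);
            }
        } catch (RuntimeException e) {
            // Still best effort, but no longer silent.
            LOG.log(Level.WARNING, "Failed to rotate consumer list", e);
        }
    }

    public static void main(String[] args) {
        List<String> consumers = new CopyOnWriteArrayList<>();
        consumers.add("worker-1");
        consumers.add("worker-2");
        rotateToBack(consumers, "worker-1");
        System.out.println(consumers); // prints [worker-2, worker-1]
    }
}
```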

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
