Ilkka Virolainen created ARTEMIS-1864:
-----------------------------------------

             Summary: On-Demand Message Redistribution Can Spontaneously Start 
Failing in Single Direction
                 Key: ARTEMIS-1864
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1864
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: Broker
    Affects Versions: 2.5.0
         Environment: RHEL 6.2
            Reporter: Ilkka Virolainen


Message redistribution in an Artemis cluster can spontaneously fail after the 
cluster has been running for a while. I've witnessed this several times 
using a two-node colocated replicating cluster with a basic configuration:
{code:xml}
<cluster-connections>
   <cluster-connection name="my-cluster">
      <connector-ref>netty-connector</connector-ref>
      <retry-interval>500</retry-interval>
      <reconnect-attempts>5</reconnect-attempts>
      <use-duplicate-detection>true</use-duplicate-detection>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <discovery-group-ref discovery-group-name="my-discovery-group"/>
   </cluster-connection>
</cluster-connections>{code}
After running for a while (approx. two weeks), one of the nodes (node a) stops 
consuming messages from the other node's (node b) internal store-and-forward 
queue. As a result, message redistribution no longer works from node b -> 
node a, although it still works from node a -> node b. The cause is unknown: 
nothing of note is logged on either broker, and JMX shows that the cluster 
topology and the broker cluster bridge connection are intact. This causes 
significant problems, mainly:

1. Client communication only works as expected if the clients happen to 
connect to the right brokers.
2. Unconsumed messages pile up in the internal store-and-forward 
queue and consume unnecessary resources. It's also possible (but not verified) 
that messages in the internal queue leak memory when they expire.
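
Since neither the logs nor the cluster topology reveal the problem, one way to 
notice it early might be to watch the store-and-forward queue itself. The 
following is a minimal sketch only, not part of Artemis: `QueueSample` and 
`is_stalled` are hypothetical names, and the readings are assumed to come from 
periodic polling of the queue's message count and cumulative acknowledgement 
count (for example over JMX). It flags a queue that held messages at every 
reading while nothing was acknowledged:

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class QueueSample:
    """One reading of a store-and-forward queue (hypothetical structure)."""
    message_count: int          # messages currently held in the queue
    messages_acknowledged: int  # cumulative acknowledgements so far


def is_stalled(samples: Sequence[QueueSample], min_samples: int = 3) -> bool:
    """Return True when the queue had a backlog at every reading and the
    acknowledgement counter did not advance across the window."""
    if len(samples) < min_samples:
        return False  # not enough evidence yet
    has_backlog = all(s.message_count > 0 for s in samples)
    no_progress = (
        samples[-1].messages_acknowledged == samples[0].messages_acknowledged
    )
    return has_backlog and no_progress
```

The polling interval and `min_samples` would need tuning so that a briefly 
idle remote consumer is not reported as a failure.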



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
