Scott Feldstein created AMQ-5082:
------------------------------------

             Summary: ActiveMQ replicatedLevelDB cluster breaks, all nodes stop 
listening
                 Key: AMQ-5082
                 URL: https://issues.apache.org/jira/browse/AMQ-5082
             Project: ActiveMQ
          Issue Type: Bug
          Components: activemq-leveldb-store
    Affects Versions: 5.9.0, 5.10.0
            Reporter: Scott Feldstein
            Priority: Critical
         Attachments: mq-node1-cluster.failure, mq-node2-cluster.failure, 
mq-node3-cluster.failure, zookeeper.out-cluster.failure

I have a three-node ActiveMQ cluster and a single ZooKeeper node, using the 
replicatedLevelDB persistence adapter:

{code}
        <persistenceAdapter>
            <replicatedLevelDB
              directory="${activemq.data}/leveldb"
              replicas="3"
              bind="tcp://0.0.0.0:0"
              zkAddress="zookeep0:2181"
              zkPath="/activemq/leveldb-stores"/>
        </persistenceAdapter>
{code}

After about a day of sitting idle, cascading failures occur and the cluster 
stops listening altogether.
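
To confirm the "stops listening" symptom from outside the brokers, a minimal 
sketch along these lines may help. It only attempts a TCP connect to each node. 
The host names (mq-node1, mq-node2, mq-node3) and port 61616 are assumptions 
for illustration; the transport connector configuration is not shown in this 
report. With replicatedLevelDB only the master is expected to accept client 
connections, so a healthy cluster should show one listening node and a failed 
cluster none.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat as not listening.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical host names; pass the real ones as arguments.
        String[] hosts = args.length > 0
                ? args
                : new String[] {"mq-node1", "mq-node2", "mq-node3"};
        int port = 61616; // assumed transport connector port, adjust as needed

        for (String host : hosts) {
            boolean up = isListening(host, port, 2000);
            System.out.println(host + ":" + port + " listening=" + up);
        }
    }
}
```

Running it periodically (for example from cron) and keeping the timestamps 
would narrow down when the last node stops accepting connections.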

I can reproduce this consistently on 5.9 and on the latest 5.10 (commit 
2360fb859694bacac1e48092e53a56b388e1d2f0).  I am attaching logs from the three 
MQ nodes and from ZooKeeper that cover the time when the cluster starts having 
issues.

The cluster stops listening at Mar 4, 2014 4:56:50 AM (within 5 seconds).

All machines run CentOS 5.9 on a single ESX server, so I doubt networking is 
the issue.

If you need more data, it should be easy to collect, since the failure is 
consistently reproducible.

This bug may be related to AMQ-5026, but it looks different enough to warrant 
a separate issue.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
