Scott Feldstein created AMQ-5082:
------------------------------------
Summary: ActiveMQ replicatedLevelDB cluster breaks, all nodes stop
listening
Key: AMQ-5082
URL: https://issues.apache.org/jira/browse/AMQ-5082
Project: ActiveMQ
Issue Type: Bug
Components: activemq-leveldb-store
Affects Versions: 5.9.0, 5.10.0
Reporter: Scott Feldstein
Priority: Critical
Attachments: mq-node1-cluster.failure, mq-node2-cluster.failure,
mq-node3-cluster.failure, zookeeper.out-cluster.failure
I have a three-node ActiveMQ cluster and a single ZooKeeper node, using the
replicatedLevelDB persistence adapter:
{code}
<persistenceAdapter>
  <replicatedLevelDB
      directory="${activemq.data}/leveldb"
      replicas="3"
      bind="tcp://0.0.0.0:0"
      zkAddress="zookeep0:2181"
      zkPath="/activemq/leveldb-stores"/>
</persistenceAdapter>
{code}
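For context on replicas="3": the ActiveMQ replicated LevelDB documentation describes this value as the cluster size and states that at least (replicas/2)+1 nodes must be online to avoid a service outage. A minimal sketch of that arithmetic follows; the helper name quorum_size is hypothetical and is not part of ActiveMQ.

```python
def quorum_size(replicas):
    """Minimum number of online nodes, per the documented (replicas/2)+1 rule.

    Hypothetical helper for illustration only; integer division mirrors
    the formula as stated in the ActiveMQ documentation.
    """
    return replicas // 2 + 1


def tolerated_failures(replicas):
    """How many nodes can be offline while the quorum is still met."""
    return replicas - quorum_size(replicas)
```

With the configuration above, quorum_size(3) is 2, so the store should tolerate one node being offline. All three brokers going silent at once is therefore beyond what the setup is designed to ride out.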
After about a day of sitting idle, there are cascading failures and the
cluster stops listening altogether.
I can reproduce this consistently on 5.9 and the latest 5.10 (commit
2360fb859694bacac1e48092e53a56b388e1d2f0). I have attached logs from the
three mq nodes and the zookeeper logs covering the time when the cluster
starts having issues.
The cluster stops listening at Mar 4, 2014 4:56:50 AM (within 5 seconds).
All nodes run CentOS 5.9 on a single ESX server, so I doubt networking is an
issue.
If you need more data, it should be easy to collect, since the failure is
consistently reproducible.
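One way to pin down when each broker stops listening during a reproduction run is a periodic TCP probe. The sketch below is only an illustration under stated assumptions: the host names mq-node1 to mq-node3 are guessed from the attachment names, and port 61616 assumes the default OpenWire transport connector, which the configuration above does not confirm.

```python
import socket
from datetime import datetime


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def probe(nodes, port, timeout=2.0):
    """Return one timestamped status line per node."""
    now = datetime.now().isoformat(timespec="seconds")
    lines = []
    for host in nodes:
        state = "up" if is_listening(host, port, timeout) else "DOWN"
        lines.append(f"{now} {host}:{port} {state}")
    return lines


# Example usage (host names and port are assumptions, see above):
#   for line in probe(["mq-node1", "mq-node2", "mq-node3"], 61616):
#       print(line)
```

Running probe from cron or a simple loop every few seconds would give a per-node timeline to line up against the broker and ZooKeeper logs.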
This bug may be related to AMQ-5026, but it looks different enough to warrant
a separate issue.
--
This message was sent by Atlassian JIRA
(v6.2#6252)