[
https://issues.apache.org/jira/browse/AMQ-5082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13967040#comment-13967040
]
Kevin McLaughlin commented on AMQ-5082:
---------------------------------------
Shutting down node1 (the node from the attached thread dumps) properly promoted
node3 to master. So it appears that node1 is somehow retaining its leadership
(while not listening) when this happens.
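
For anyone trying to confirm the "leader but not listening" state on their own
cluster, a plain TCP probe against each broker is enough. This is a minimal
sketch, not part of the attached diagnostics: the hostnames node1..node3 and
port 61616 (the default OpenWire transport port) are assumptions and need to be
adjusted to the actual transportConnector settings.

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def listening_report(nodes, timeout=2.0):
    """Map 'host:port' to True/False for each (host, port) pair in nodes."""
    return {
        f"{host}:{port}": is_listening(host, port, timeout)
        for host, port in nodes
    }


if __name__ == "__main__":
    # Hypothetical broker addresses; replace with the real ones.
    brokers = [("node1", 61616), ("node2", 61616), ("node3", 61616)]
    for address, up in listening_report(brokers).items():
        print(address, "listening" if up else "NOT listening")
```

If every broker reports NOT listening while ZooKeeper still shows an elected
master, that matches the state described above.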
> ActiveMQ replicatedLevelDB cluster breaks, all nodes stop listening
> -------------------------------------------------------------------
>
> Key: AMQ-5082
> URL: https://issues.apache.org/jira/browse/AMQ-5082
> Project: ActiveMQ
> Issue Type: Bug
> Components: activemq-leveldb-store
> Affects Versions: 5.9.0, 5.10.0
> Reporter: Scott Feldstein
> Priority: Critical
> Attachments: 03-07.tgz, amq_5082_threads.tar.gz,
> mq-node1-cluster.failure, mq-node2-cluster.failure, mq-node3-cluster.failure,
> zookeeper.out-cluster.failure
>
>
> I have a 3-node AMQ cluster and one ZooKeeper node, using a replicatedLevelDB
> persistence adapter.
> {code}
> <persistenceAdapter>
>   <replicatedLevelDB
>     directory="${activemq.data}/leveldb"
>     replicas="3"
>     bind="tcp://0.0.0.0:0"
>     zkAddress="zookeep0:2181"
>     zkPath="/activemq/leveldb-stores"/>
> </persistenceAdapter>
> {code}
> After about a day of sitting idle, there are cascading failures and the
> cluster stops listening altogether.
> I can reproduce this consistently on 5.9 and the latest 5.10 (commit
> 2360fb859694bacac1e48092e53a56b388e1d2f0). I am going to attach logs from
> the three mq nodes and the zookeeper logs that reflect the time where the
> cluster starts having issues.
> The cluster stops listening at Mar 4, 2014 4:56:50 AM (within 5 seconds).
> The OSs are all CentOS 5.9 on one ESX server, so I doubt networking is an
> issue.
> If you need more data it should be pretty easy to get whatever is needed
> since it is consistently reproducible.
> This bug may be related to AMQ-5026, but looks different enough to file a
> separate issue.