tai-jen gordon created AMQ-6333:
-----------------------------------
Summary: queue messages lost during failover
Key: AMQ-6333
URL: https://issues.apache.org/jira/browse/AMQ-6333
Project: ActiveMQ
Issue Type: Bug
Components: activemq-leveldb-store
Affects Versions: 5.13.2
Environment: a 3 node cluster running on RHEL 6.7
Reporter: tai-jen gordon
In this 3-node cluster running ActiveMQ 5.13.2 with LevelDB, we tested failover
with a producer that generates 200 account IDs (0 to 199), and expected
the consumer to receive the same number of messages after a triggered
failover (kill -9 master-process-id).
The producer used syncSend, the consumer used a transacted session, and the
destination was a queue.
We printed the message sequence number and its corresponding account ID on the
producer and consumer consoles after each message was successfully sent or
received, respectively.
The result showed that the producer sent 199 messages (one send failed due to
the failover), but the consumer received far fewer than 199 messages.
For example, on the consumer console, before the failover we had
......
received = 13 accountid=12
received = 14 accountid=13
Then, right after the failover, the consumer console had
received = 15 accountid=44
received = 16 accountid=45
...
Notice that the messages for account IDs 14 to 43 (30 messages) were lost and never delivered.
We scanned the broker logs of both the new and the old master and noticed that
pagedInPendingDispatch.size on the "old" master was 30, which matches
the number of lost messages. We repeated this test a few times, and each time
this size matched the number of lost messages.
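For anyone reproducing this, the gap can be computed from the consumer console output instead of by eye. The following is only a minimal sketch: it assumes the console lines look like the excerpts above ("received = N accountid=M"), and the names LINE_RE, received_account_ids and missing_ranges are illustrative helpers, not part of ActiveMQ or of our test client.

```python
import re

# Assumed console line format, taken from the excerpts in this report:
#   received = 15 accountid=44
LINE_RE = re.compile(r"received\s*=\s*(\d+)\s+accountid\s*=\s*(\d+)", re.IGNORECASE)


def received_account_ids(lines):
    """Extract the account ID from every line that matches the assumed format."""
    ids = []
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            ids.append(int(match.group(2)))
    return ids


def missing_ranges(received, first, last):
    """Return inclusive (start, end) ranges of IDs in [first, last] never received."""
    seen = set(received)
    ranges = []
    start = None
    for account_id in range(first, last + 1):
        if account_id not in seen:
            if start is None:
                start = account_id  # a new gap begins here
        elif start is not None:
            ranges.append((start, account_id - 1))  # the gap ended just before this ID
            start = None
    if start is not None:
        ranges.append((start, last))  # the gap runs to the end of the range
    return ranges


# The four consumer lines quoted in this report, around the failover.
console = [
    "received = 13 accountid=12",
    "received = 14 accountid=13",
    "received = 15 accountid=44",
    "received = 16 accountid=45",
]

gaps = missing_ranges(received_account_ids(console), 12, 45)
lost = sum(end - start + 1 for start, end in gaps)
print(gaps, lost)  # prints: [(14, 43)] 30
```

With the full consumer output and the range 0 to 199, the lost count can be compared directly against the pagedInPendingDispatch.size value logged by the old master.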
The configuration we used is listed below:
Broker configuration:
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="broker"
dataDirectory="${activemq.data}" useJmx="true" useShutdownHook="false"
shutdownOnMasterFailure="true" systemExitOnShutdown="true"
systemExitOnShutdownExitCode="200">
  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <policyEntry topic=">" >
          <pendingMessageLimitStrategy>
            <constantPendingMessageLimitStrategy limit="1000"/>
          </pendingMessageLimitStrategy>
        </policyEntry>
        <policyEntry queue=">" useCache="false" expireMessagesPeriod="0" />
      </policyEntries>
    </policyMap>
  </destinationPolicy>
  <persistenceAdapter>
    <replicatedLevelDB
      directory="${activemq.base}/leveldb-data"
      replicas="3"
      bind="tcp://0.0.0.0:0"
      zkAddress="test1:2181,test2:2181,test3:2181"
      zkPath="/activemq/leveldb"
      sync="quorum_disk"
      paranoidChecks="true"
      logCompression="snappy"
    />
  </persistenceAdapter>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)