We run a single 3-node master/slave cluster backed by replicated LevelDB. Our requirement is reliable message delivery, even during failover.
To test failover, we used a producer that sends 200 AccountIds (0 to 199) as messages, and we expected the consumer to receive the same number of messages after a triggered failover (kill -9 master-process-id). However, after the failover the consumer received fewer messages.

In the "old" master log, the last line before the kill was:

  2016-05-29 22:35:32,809 | | DEBUG | queue://dest23, subscriptions=1, memory=0%, size=31, pending=0 toPageIn: 0, Inflight: 1, pagedInMessages.size 31, pagedInPendingDispatch.size 30, enqueueCount: 44, dequeueCount: 13, memUsage:39091 | org.apache.activemq.broker.region.Queue | ActiveMQ BrokerService[broker] Task-5

In the "new" master, we first saw toPageIn = -13:

  2016-05-29 22:35:39,097 | DEBUG | queue://dest23, subscriptions=0, memory=0%, size=-13, pending=0 toPageIn: -13, Inflight: 0, pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 0, dequeueCount: 0, memUsage:0 | org.apache.activemq.broker.region.Queue | main

After the new master had received 13 messages, toPageIn reached 0:

  2016-05-29 22:35:48,760 | DEBUG | queue://dest23, subscriptions=1, memory=0%, size=-1, pending=0 toPageIn: -1, Inflight: 0, pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 12, dequeueCount: 0, memUsage:1261 | org.apache.activemq.broker.region.Queue | ActiveMQ BrokerService[c2c_2] Task-5
  2016-05-29 22:35:48,837 | DEBUG | commit: TX:ID:LB077878-62929-1464575723526-1:1:58 syncCount: 1 | org.apache.activemq.transaction.LocalTransaction | ActiveMQ Transport: tcp:///10.25.65.230:62947@61616
  2016-05-29 22:35:48,850 | DEBUG | c2c_2 Message ID:LB077878-62929-1464575723526-1:1:2:1:58 sent to queue://dest23 | org.apache.activemq.broker.region.Queue | ActiveMQ Transport: tcp:///10.25.65.230:62947@61616
  2016-05-29 22:35:48,851 | DEBUG | queue://dest23, subscriptions=1, memory=0%, size=0, pending=0 toPageIn: 0, Inflight: 0, pagedInMessages.size 0, pagedInPendingDispatch.size 0, enqueueCount: 13, dequeueCount: 0, memUsage:1261 | org.apache.activemq.broker.region.Queue | ActiveMQ BrokerService[c2c_2] Task-5

Finally, the producer finished sending and the consumer stopped receiving messages, with 30 messages not accounted for on the consumer side.

The consumer application console showed:

  received = 14 accountid=13 // this is the last output before failover
  jms Excpetion listener class javax.jms.TransactionRolledBackException
  Transaction completion in doubt due to failover. Forcing rollback of TX:ID:LB077878-62921-1464575689160-1:1:14
  errorHandler class javax.jms.TransactionRolledBackException
  errorHandlerTransaction completion in doubt due to failover. Forcing rollback of TX:ID:LB077878-62921-1464575689160-1:1:14
  jms Excpetion listener class javax.jms.TransactionRolledBackException
  Transaction completion in doubt due to failover. Forcing rollback of TX:ID:LB077878-62921-1464575689160-1:1:14
  May 29, 2016 10:35:40 PM org.springframework.jms.listener.DefaultMessageListenerContainer handleListenerSetupFailure
  WARNING: Setup of JMS message listener invoker failed for destination 'dest23' - trying to recover. Cause: Transaction completion in doubt due to failover. Forcing rollback of TX:ID:LB077878-62921-1464575689160-1:1:14
  javax.jms.TransactionRolledBackException: Transaction completion in doubt due to failover. Forcing rollback of TX:ID:LB077878-62921-1464575689160-1:1:14
      at org.apache.activemq.state.ConnectionStateTracker.restoreTransactions(ConnectionStateTracker.java:255)
      at org.apache.activemq.state.ConnectionStateTracker.restore(ConnectionStateTracker.java:192)
      at org.apache.activemq.transport.failover.FailoverTransport.restoreTransport(FailoverTransport.java:855)
      at org.apache.activemq.transport.failover.FailoverTransport.doReconnect(FailoverTransport.java:1033)
      at org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:149)
      at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:133)
      at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:48)
      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      at java.lang.Thread.run(Thread.java:745)
  May 29, 2016 10:35:40 PM org.springframework.jms.listener.DefaultMessageListenerContainer refreshConnectionUntilSuccessful
  INFO: Successfully refreshed JMS Connection
  jms Excpetion listener class javax.jms.IllegalStateException
  The Consumer is closed
  May 29, 2016 10:35:42 PM org.springframework.jms.listener.DefaultMessageListenerContainer handleListenerSetupFailure
  WARNING: Setup of JMS message listener invoker failed for destination 'dest23' - trying to recover. Cause: The Consumer is closed
  javax.jms.IllegalStateException: The Consumer is closed
      at org.apache.activemq.ActiveMQMessageConsumer.checkClosed(ActiveMQMessageConsumer.java:862)
      at org.apache.activemq.ActiveMQMessageConsumer.receive(ActiveMQMessageConsumer.java:625)
      at org.apache.activemq.jms.pool.PooledMessageConsumer.receive(PooledMessageConsumer.java:67)
      at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveMessage(AbstractPollingMessageListenerContainer.java:420)
      at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:300)
      at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:253)
      at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1158)
      at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1150)
      at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1047)
      at java.lang.Thread.run(Thread.java:745)
  May 29, 2016 10:35:43 PM org.springframework.jms.listener.DefaultMessageListenerContainer refreshConnectionUntilSuccessful
  INFO: Successfully refreshed JMS Connection
  message JMS Redelivered=false
  received = 15 accountid=44 // the first message received after failover

Note that accountids 14 to 43 were never delivered. The number of missing messages also matches the pagedInPendingDispatch.size (30) of the "old" master, so it seems that the messages in pagedInPendingDispatch are not redelivered when a failover occurs.

Please help. Thanks!
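For anyone reproducing this, the missing range can be derived mechanically from the consumer output. Below is a minimal sketch, assuming the producer sends consecutive integer AccountIds from 0 to 199 and the consumer records every accountid it receives. The class GapCheck and the method missingIds are hypothetical names used for illustration only; they are not part of our test applications or of ActiveMQ. The sample data also assumes that every accountid from 44 onward arrived, which is consistent with the 30 missing messages we counted.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

public class GapCheck {

    /** Returns the ids in [0, total) that never appear in received, in ascending order. */
    public static List<Integer> missingIds(int total, Iterable<Integer> received) {
        BitSet seen = new BitSet(total);
        for (int id : received) {
            if (id >= 0 && id < total) {
                seen.set(id); // duplicates (redeliveries) are harmless here
            }
        }
        List<Integer> missing = new ArrayList<>();
        for (int id = seen.nextClearBit(0); id < total; id = seen.nextClearBit(id + 1)) {
            missing.add(id);
        }
        return missing;
    }

    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        // accountids seen before the failover (last console line: accountid=13)
        for (int id = 0; id <= 13; id++) received.add(id);
        // assumption: everything from accountid=44 onward arrived after the failover
        for (int id = 44; id <= 199; id++) received.add(id);

        List<Integer> missing = missingIds(200, received);
        System.out.println("missing count = " + missing.size());
        System.out.println("first missing = " + missing.get(0)
                + ", last missing = " + missing.get(missing.size() - 1));
    }
}
```

With the data above, the check reports 30 missing ids, from 14 to 43, which is the same count as pagedInPendingDispatch.size on the "old" master.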
-------------------------------------------------------------------------------------------------------------

ActiveMQ version: apache-activemq-5.13.2

Broker configuration:

<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="broker"
        dataDirectory="${activemq.data}"
        useJmx="true"
        useShutdownHook="false"
        shutdownOnMasterFailure="true"
        systemExitOnShutdown="true"
        systemExitOnShutdownExitCode="200">

    <destinationPolicy>
        <policyMap>
            <policyEntries>
                <policyEntry topic=">" >
                    <pendingMessageLimitStrategy>
                        <constantPendingMessageLimitStrategy limit="1000"/>
                    </pendingMessageLimitStrategy>
                </policyEntry>
                <policyEntry queue=">" useCache="false" expireMessagesPeriod="0" />
            </policyEntries>
        </policyMap>
    </destinationPolicy>

    <persistenceAdapter>
        <replicatedLevelDB
            directory="${activemq.base}/leveldb-data"
            replicas="3"
            bind="tcp://0.0.0.0:0"
            zkAddress="test1:2181,test2:2181,test3:2181"
            zkPath="/activemq/leveldb"
            sync="quorum_disk"
            paranoidChecks="true"
            logCompression="snappy" />
    </persistenceAdapter>