Broker can lose messages during master/slave failover when master undergoes a
controlled shutdown
-------------------------------------------------------------------------------------------------
Key: AMQ-3364
URL: https://issues.apache.org/jira/browse/AMQ-3364
Project: ActiveMQ
Issue Type: Bug
Components: Broker
Affects Versions: 5.5.0, 5.4.2
Reporter: Martin Serrano
Priority: Critical
I see this problem consistently when a producer is continuously sending
messages and the master is shut down in a controlled fashion. During a
controlled shutdown of the master broker, the BrokerService.stop() method
stops things in this order:
* services
* connectors
* registered vm transports
* broker
So there is a period in which the broker still processes sends after other
(apparently necessary) facilities have been shut down. I have not followed the
code paths to understand exactly what goes wrong, but I traced enough to tell
that messages sent in this interval can disappear. That is, the client's send
call returns without error, but after failover the slave does not replay the
message.
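
To make the suspected window concrete, here is a toy model of it. It is not
ActiveMQ code, and the mechanism is an assumption: the report does not
establish which facility matters, so the sketch simply supposes that forwarding
to the slave is among the things stopped before the broker itself. ToyMaster
and all of its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class ShutdownWindowSketch {

    /** Toy stand-in for the master broker; hypothetical, not an ActiveMQ class. */
    static class ToyMaster {
        private final List<String> acknowledged = new ArrayList<>();
        private final List<String> replicatedToSlave = new ArrayList<>();
        // Assumption: replication is one of the facilities stopped early.
        private boolean replicationRunning = true;
        private boolean brokerRunning = true;

        /** Models the first steps of stop(): services, connectors, vm transports. */
        void stopEarlyFacilities() { replicationRunning = false; }

        /** Models the last step of stop(): the broker itself. */
        void stopBroker() { brokerRunning = false; }

        /** Returns true when the send is acknowledged to the client. */
        boolean send(String message) {
            if (!brokerRunning) {
                return false; // client sees a failure and can fail over
            }
            if (replicationRunning) {
                replicatedToSlave.add(message);
            }
            acknowledged.add(message); // acknowledged even if not replicated
            return true;
        }

        /** Messages the client believes were sent but the slave never saw. */
        List<String> lostMessages() {
            List<String> lost = new ArrayList<>(acknowledged);
            lost.removeAll(replicatedToSlave);
            return lost;
        }
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        master.send("m1");            // normal operation
        master.stopEarlyFacilities();
        master.send("m2");            // inside the window
        master.stopBroker();
        master.send("m3");            // refused, so the client knows to retry
        System.out.println("lost: " + master.lostMessages()); // prints "lost: [m2]"
    }
}
```

In this model only m2 is at risk: it is acknowledged after the early facilities
stop but before the broker does, which matches the symptom described above.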
This appears to only be an issue during a controlled shutdown. Process death
should not cause this problem.
I'm currently working around this by having the BrokerService set a stopping
flag at the start of shutdown and having the MasterBroker check this flag and
reject sends (with a new exception class) when it is set. My client code then
detects this case and simply retries until the failover is complete.
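
The workaround can be sketched roughly as follows. This is a standalone
illustration under stated assumptions, not the actual patch: it does not use
the real BrokerService or MasterBroker classes, and BrokerStoppingException,
SendTarget, GuardedBroker and sendWithRetry are hypothetical names invented
for the example. Failover is simulated by a resolver that starts returning the
slave after two calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class StoppingFlagSketch {

    /** Hypothetical stand-in for the "new exception class" used to reject sends. */
    static class BrokerStoppingException extends Exception {
        BrokerStoppingException(String message) { super(message); }
    }

    /** Minimal send interface for the sketch; not an ActiveMQ interface. */
    interface SendTarget {
        void send(String message) throws BrokerStoppingException;
    }

    /** Toy broker that refuses sends once its stopping flag is set. */
    static class GuardedBroker implements SendTarget {
        private final List<String> accepted = new ArrayList<>();
        private boolean stopping = false;

        /** Would be called first thing in a controlled shutdown. */
        synchronized void beginStop() { stopping = true; }

        @Override
        public synchronized void send(String message) throws BrokerStoppingException {
            if (stopping) {
                throw new BrokerStoppingException("broker is stopping; retry after failover");
            }
            accepted.add(message);
        }

        synchronized List<String> accepted() { return new ArrayList<>(accepted); }
    }

    /** Client side: retry while the current broker reports that it is stopping. */
    static int sendWithRetry(Supplier<SendTarget> currentBroker, String message,
                             int maxAttempts, long backoffMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                currentBroker.get().send(message);
                return attempt; // number of attempts it took
            } catch (BrokerStoppingException e) {
                try {
                    Thread.sleep(backoffMillis); // wait for failover to progress
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while retrying", ie);
                }
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) throws Exception {
        GuardedBroker master = new GuardedBroker();
        GuardedBroker slave = new GuardedBroker();
        master.send("m1");
        master.beginStop(); // controlled shutdown starts

        // Simulated failover: the first two lookups still resolve to the master.
        int[] lookups = {0};
        Supplier<SendTarget> resolver = () -> (++lookups[0] <= 2) ? master : slave;

        int attempts = sendWithRetry(resolver, "m2", 5, 10L);
        System.out.println("m2 delivered after " + attempts + " attempts; master="
                + master.accepted() + ", slave=" + slave.accepted());
        // prints "m2 delivered after 3 attempts; master=[m1], slave=[m2]"
    }
}
```

The point of rejecting with a distinct exception type is that the client can
tell "broker is stopping, retry" apart from other send failures, so a message
is never acknowledged during the shutdown window.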