Messages are sometimes skipped when using JDBC master/slave
------------------------------------------------------------
Key: AMQ-1658
URL: https://issues.apache.org/activemq/browse/AMQ-1658
Project: ActiveMQ
Issue Type: Bug
Components: Message Store
Affects Versions: 5.0.0
Environment: Windows XP SP2, Java 6, MySQL 5.0.51a (InnoDB table)
Reporter: François Guillemette
Fix For: 5.1.0
Sometimes one or more messages remain stuck in the queue and no consumer
receives them. This happens more often after a failover.
Scenario:
2 brokers (JDBC master/slave), 2 consumers (with prefetch set to 1), 2 producers
Producers:
ant producer -Durl="failover:(tcp://localhost:61618,tcp://localhost:61619)"
-Ddurable=true -Dmax=500
Consumer 1:
ant consumer -Durl="failover:(tcp://localhost:61618,tcp://localhost:61619)"
-Dmax=10000 -DclientId=c1
Consumer 2:
ant consumer -Durl="failover:(tcp://localhost:61618,tcp://localhost:61619)"
-Dmax=10000 -DclientId=c2
1 - Start the two brokers (one will be master, the other will be slave)
2 - Start the producers and consumers
3 - Wait a little
4 - Kill the master -> the slave becomes master
5 - Producers continue producing, consumers continue consuming
6 - After all producers finish their task, the consumers finish consuming,
but sometimes messages are still left in the queue (visible both in the
database and in the queue state shown through JMX).
7 - Start a new broker, then kill the master
8 - The remaining messages are consumed
There is a race condition between the time the message is assigned its broker
sequence number (send method in RegionBroker.java) and the time it is actually
written to the database (doAddMessage method in DefaultJDBCAdapter.java).
I have seen that a message with a higher sequence number is sometimes written
to the database before one with a lower sequence number. For example, 386 is
written before 385. If this happens while JDBCMessageStore is recovering the
next message (lastMessageId is 384), then 386 is fetched and lastMessageId
becomes 386. 385 is then written to the database but never retrieved. Stopping
and restarting the broker allows the message to be retrieved, because
lastMessageId is -1 at startup.
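To make the ordering problem concrete, here is a minimal, single-threaded
sketch that replays the 384/385/386 example by hand. It is not the ActiveMQ
code: the class SkippedMessageDemo, its in-memory TreeMap "table", and the
methods insert, recoverNext and resetCursor are all hypothetical stand-ins for
the message table and for a cursor that only fetches rows with an id greater
than lastMessageId.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of the behaviour described above, not ActiveMQ code.
class SkippedMessageDemo {
    // Stand-in for the message table, keyed by broker sequence id.
    private final TreeMap<Long, String> table = new TreeMap<>();
    // Stand-in for the store's recovery cursor.
    private long lastMessageId;

    SkippedMessageDemo(long lastMessageId) {
        this.lastMessageId = lastMessageId;
    }

    // Stand-in for the database insert of one message.
    void insert(long sequenceId) {
        table.put(sequenceId, "msg-" + sequenceId);
    }

    // Fetch every row with id > lastMessageId, then advance the cursor
    // to the highest id that was fetched.
    List<Long> recoverNext() {
        List<Long> fetched =
                new ArrayList<>(table.tailMap(lastMessageId, false).keySet());
        if (!fetched.isEmpty()) {
            lastMessageId = fetched.get(fetched.size() - 1);
        }
        return fetched;
    }

    // Models a broker restart, where the cursor starts again from -1.
    void resetCursor() {
        lastMessageId = -1;
    }

    public static void main(String[] args) {
        SkippedMessageDemo store = new SkippedMessageDemo(384);

        store.insert(386);                       // 386 reaches the database first
        System.out.println(store.recoverNext()); // [386], cursor is now 386

        store.insert(385);                       // 385 arrives late
        System.out.println(store.recoverNext()); // [], 385 is below the cursor

        store.resetCursor();                     // restart: cursor back to -1
        // 386 shows up again only because this sketch never deletes
        // acknowledged rows; the point is that 385 is finally visible.
        System.out.println(store.recoverNext()); // [385, 386]
    }
}
```

The second call returns nothing even though 385 is in the table, which matches
the stuck message seen in step 6 of the scenario.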
I have synchronized the code inside RegionBroker.send, and I no longer see
gaps. This is an acceptable workaround for us since we don't process many
messages. A more elegant solution might be to set the brokerSequenceId in
doAddMessage of JDBCAdapter (I may be wrong; I didn't check whether
brokerSequenceId is used elsewhere).
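The idea behind the workaround can be sketched as follows: if the sequence
number is assigned and the row is written under the same lock, no other thread
can insert a higher number in between. This is only an illustration of that
principle, not a patch to RegionBroker. The class OrderedSendSketch, its
sendLock, and the in-memory insertOrder list (standing in for the database
insert) are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the locking idea, not ActiveMQ code.
class OrderedSendSketch {
    private final Object sendLock = new Object();
    private long nextSequenceId = 0;
    // Records the order in which ids reach the (simulated) database.
    private final List<Long> insertOrder = new ArrayList<>();

    // Assign the sequence id and "insert" the message under one lock,
    // so ids always reach the store in ascending order.
    long send() {
        synchronized (sendLock) {
            long id = ++nextSequenceId;
            insertOrder.add(id); // stand-in for the JDBC insert
            return id;
        }
    }

    List<Long> insertOrder() {
        synchronized (sendLock) {
            return new ArrayList<>(insertOrder);
        }
    }

    // Run several producer threads against one broker and return the
    // order in which their messages were stored.
    static List<Long> runProducers(int producers, int messagesEach) {
        final OrderedSendSketch broker = new OrderedSendSketch();
        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < producers; p++) {
            Thread t = new Thread(() -> {
                for (int n = 0; n < messagesEach; n++) {
                    broker.send();
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return broker.insertOrder();
    }

    public static void main(String[] args) {
        List<Long> order = runProducers(2, 500);
        boolean ascending = true;
        for (int i = 1; i < order.size(); i++) {
            if (order.get(i) <= order.get(i - 1)) {
                ascending = false;
            }
        }
        System.out.println(order.size() + " inserts, ascending=" + ascending);
    }
}
```

The cost is that all sends are serialized behind one lock, which is why this
is acceptable only at low message volumes.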