[ https://issues.apache.org/jira/browse/AMQ-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Home updated AMQ-5125:
----------------------------
    Attachment: AMQ-5125_made_synchronization_in_LevelDBStore_use_private_lock.5.9.2.patch

I believe the issue this ticket was raised against is the same one that badavis analysed in his recent post:
http://activemq.2283324.n4.nabble.com/ActiveMQ-message-dequeuing-hangs-td4681366.html

I've gone through the code, and I agree with his findings. I've attached a patch against 5.9.2 based on them, in which LevelDBStore synchronises on an internal private lock variable instead of _this_. That removes the deadlock that can occur when the LocalTransaction synchronises externally on the transactionStore during a rollback.

With this patch, the original (attached) test runs for hours without deadlocking, as do my own tests, which previously deadlocked after a few thousand messages.

Thanks,
James

> Broker and clients hang
> -----------------------
>
>                 Key: AMQ-5125
>                 URL: https://issues.apache.org/jira/browse/AMQ-5125
>             Project: ActiveMQ
>          Issue Type: Bug
>    Affects Versions: 5.9.0, 5.10.0
>         Environment: Windows 7, Linux
> JAVA_HOME=C:\Program Files\Java\jdk1.7.0_07
> ActiveMQ 5.9.0 with LevelDB storage adapter enabled
>            Reporter: Albert Barmettler
>            Priority: Blocker
>         Attachments: AMQ-5125_made_synchronization_in_LevelDBStore_use_private_lock.5.9.2.patch,
> Broker-threaddump-1396544622970.tdump,
> Clients-threaddump-1396544622970.tdump, VM.PNG, activemq.xml,
> broker-threads-blocked.tdump, src.zip
>
>
> JMS clients start to hang after a while in calls such as
> session.createObjectMessage(). Both the broker and the hanging clients can't
> be easily shut down when this happens - only forcefully applied kills do the
> job.
> I'm using queues and transactional sessions. All clients (producers and
> consumers) are in the same Java VM. There is only one JMS connection between
> the application and the broker. Each client has its own session, but they all
> share the same connection.
> Normally, the LevelDB data directory contains only a few log files.
> But in my case, the number of log files is steadily increasing.
> Furthermore, I was able to track the issue down to the following circumstance:
> The problem only occurs when consumers do a rollback instead of a commit
> when they receive the message. The rollback / redelivery works as expected -
> the same message is received again after a previous rollback.
> As far as I can tell, the problem does not occur with KahaDB.
> I'll attach a test program that provokes the error. It sets up a few hundred
> queues, consumers and producers. The consumers just receive the message and
> commit the session, but they also do "random" rollbacks. It can be observed
> immediately that the number of files starts increasing in the data directory.
> After a few minutes, the clients hang - sometimes sooner, sometimes later.
> I'll also attach the config file for the broker.
> I am aware that heavy use of rollbacks should not happen in normal operation.
> But from a long-term stability perspective, this is a blocker for us.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
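
For anyone following the locking change in the comment above, here is a minimal, self-contained sketch of the general idea. It is not the attached patch and does not use any ActiveMQ code: PublicMonitorStore, PrivateLockStore and completesWhileMonitorHeld are hypothetical names made up for illustration. It only shows that when a class guards its state with synchronized methods (that is, on `this`), any outside code that synchronises on the same object can block it, whereas a private lock object is out of reach of external callers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class PrivateLockSketch {

    /** Hypothetical store that guards its state with its own monitor (synchronized on this). */
    static class PublicMonitorStore {
        private int commits;

        synchronized void commit() {
            commits++;
        }
    }

    /** Hypothetical store that guards its state with a private lock object instead. */
    static class PrivateLockStore {
        private final Object lock = new Object();
        private int commits;

        void commit() {
            synchronized (lock) {
                commits++;
            }
        }
    }

    /**
     * Holds the monitor of the given object on the calling thread (as an external
     * caller synchronising on the store would) and reports whether the given work,
     * run on a second thread, finishes within the timeout while the monitor is held.
     */
    static boolean completesWhileMonitorHeld(Object monitor, Runnable work, long timeoutMs)
            throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        Thread worker = new Thread(() -> {
            work.run();
            done.countDown();
        });
        worker.setDaemon(true);
        synchronized (monitor) {
            worker.start();
            return done.await(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        PublicMonitorStore publicStore = new PublicMonitorStore();
        PrivateLockStore privateStore = new PrivateLockStore();

        // The worker cannot enter commit() while the caller holds the store's monitor.
        System.out.println("synchronized(this) store completes: "
                + completesWhileMonitorHeld(publicStore, publicStore::commit, 500));

        // The worker only needs the private lock, which the caller cannot hold.
        System.out.println("private-lock store completes: "
                + completesWhileMonitorHeld(privateStore, privateStore::commit, 2000));
    }
}
```

Run as written, the first call should report false and the second true. The sketch deliberately demonstrates blocking rather than a real deadlock, so it always terminates; a deadlock needs a second lock acquired in the opposite order by another thread, which is what the rollback path described in the comment provides.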