I don't know whether anyone is looking into this or has any ideas, but I have made some new discoveries that might help in figuring out what is going on.
I still have not been able to replicate the issue in a smaller, more controlled environment, even though pretty much everything is the same with regard to broker, configuration, application and client setup. I suspect it might be caused in part by the number of clients in the real environment, which is something I cannot really simulate. What I have found, though, are two workarounds. Neither is ideal, but maybe they can give a hint to someone other than me.

Workaround 1: If I remove the failover nodes from the RA configuration, the problem does not appear. The config is then roughly:

RA1: failover:(tcp://broker1:61616)?nested.soLinger=10&nested.soTimeout=200000&jms.rmIdFromConnectionId=true&maxReconnectAttempts=0
RA2: failover:(tcp://broker2:61616)?nested.soLinger=10&nested.soTimeout=200000&jms.rmIdFromConnectionId=true&maxReconnectAttempts=0
And so on...

This eliminates the issue entirely, but at the cost that an RA and its corresponding MDB do not fail over, and so cannot perform any work for the duration of their broker's downtime.

Workaround 2: If I set "initialReconnectDelay" to a value of 5000 or more, this sort of fixes the issue. Example of one RA connectionURL:

failover:(tcp://broker1:61616,tcp://broker2:61616,tcp://broker3:61616)?nested.soLinger=10&nested.soTimeout=200000&jms.rmIdFromConnectionId=true&randomize=false&priorityBackup=true&maxReconnectAttempts=0&initialReconnectDelay=5000

This kind of works, but at least with a delay of 5000 I still get the lock every now and then, with the upside that an additional broker restart fixes it. I do not want this setup in a production environment, but at least it sort of works without any major impact on application performance.

Without much evidence to support it, I think the issue might be explained in the ActiveMQ Failover documentation <https://activemq.apache.org/failover-transport-reference>.
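For anyone who wants to compare the two configurations side by side, here is a minimal Python sketch that only assembles the connection URLs quoted above. The helper name failover_url and the COMMON dict are hypothetical names made up for illustration, not part of ActiveMQ or the RA; the sketch does plain string building and never talks to a broker.

```python
# Hypothetical helper: builds a failover: URL string from broker host names
# and an ordered dict of options. Port 61616 is taken from the URLs above.
def failover_url(brokers, options):
    uris = ",".join(f"tcp://{host}:61616" for host in brokers)
    query = "&".join(f"{key}={value}" for key, value in options.items())
    return f"failover:({uris})?{query}"


# Options that appear in both workarounds.
COMMON = {
    "nested.soLinger": 10,
    "nested.soTimeout": 200000,
    "jms.rmIdFromConnectionId": "true",
}

# Workaround 1: one broker per RA, so there is nothing to fail over to.
workaround_1 = failover_url(["broker1"], {**COMMON, "maxReconnectAttempts": 0})

# Workaround 2: all three brokers, plus an initial reconnect delay of 5000.
workaround_2 = failover_url(
    ["broker1", "broker2", "broker3"],
    {
        **COMMON,
        "randomize": "false",
        "priorityBackup": "true",
        "maxReconnectAttempts": 0,
        "initialReconnectDelay": 5000,
    },
)

print(workaround_1)
print(workaround_2)
```

The only differences between the two URLs are the broker list and the three extra options (randomize, priorityBackup, initialReconnectDelay), which may help narrow down which of them matters.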
Under "Transactions", the failover documentation describes an issue that sounds somewhat similar to what I am seeing, but the release notes for the fix version seem to be offline, so I have been unable to track down the specific fix that was implemented. Perhaps it is something that could be adopted in the Artemis broker as well?

Any thoughts on this? Or is there something inherently incompatible between my setup and the Artemis broker?

Br,
Anton

--
Sent from: http://activemq.2283324.n4.nabble.com/ActiveMQ-User-f2341805.html