Hello JFry, I'm not an expert, but IMHO:

Regarding the message

Store limit is 30720 mb (current store usage is 0 mb). The data directory:
D:\Apps\apache-activemq-5.18.3\data only has 25672 mb of usable space. -
resetting to maximum available disk space: 25672 mb

this is not a real problem: it only notifies you that the configured size for
the local datastore cannot be reached because the available disk space is
lower, so AMQ dynamically adapts itself. If disk space were a real issue, I
imagine you would get log rows with more critical messages.

Regarding your main issue: in theory, if you restart the slower broker
instance and let it get the lease again (e.g. by shutting down all the
others), it should flush to the shared datastore the pending messages it
still has in its local datastore.

Regarding "Is there a way to stop the broker from shutting down when it fails
to renew its lease?": in the past I've seen the broker trying to reload itself
after some errors, but maybe it's faster to set up a kind of watchdog and
reboot the machine if the broker process goes down.

Question: does the 2nd log frame come from the same machine as the 1st log
frame?

Kind Regards
M.G.

On Fri, 16 Feb 2024 at 12:24, Jack Fry (They/Them) <j...@scottlogic.com.invalid> wrote:

> I've been investigating this more the whole week, I would really
> appreciate a response. What I can see from inspecting the logs and
> metricbeat analysis, is that the mssql connection in the first broker
> starts getting very slow, which causes it to lose the master lease while
> messages are still inflight.
>
> When the first broker fails to restore the lease it starts to shutdown and
> the messages are lost without being processed. The jdbc persistence adapter
> in the activemq.xml file does not set failIfLocked so that should be false.
> Is there a way to stop the broker from shutting down when it fails to renew
> its lease?
>
> Secondly, when the second broker starts and retries the failed messages,
> it gets a Primary Key violation in object 'dbo.ACTIVEMQ_MSGS'. I believe
> this is in the temp store within the server. Is there a way to flush this
> temp store on a failure?
>
> We are using ActiveMq classic v5.18.3 and these failovers started not long
> after upgrading from v5.13.2.
> Thanks
>
> -----Original Message-----
> From: Jack Fry (They/Them) <j...@scottlogic.com.INVALID>
> Sent: Monday, February 12, 2024 4:16 PM
> To: users@activemq.apache.org
> Subject: RE: Message delivery failure and Primary Key violation on load balanced ActiveMQ Broker
>
> Looking at the logs again, we noticed this warning on the broker as well:
>
> Store limit is 30720 mb (current store usage is 0 mb). The data directory:
> D:\Apps\apache-activemq-5.18.3\data only has 25672 mb of usable space. -
> resetting to maximum available disk space: 25672 mb
>
> Could this be related? If there was not enough available disk space would
> It cause this exception to occur?
>
> -----Original Message-----
> From: Jack Fry (They/Them) <j...@scottlogic.com.INVALID>
> Sent: Monday, February 12, 2024 2:35 PM
> To: users@activemq.apache.org
> Subject: Message delivery failure and Primary Key violation on load balanced ActiveMQ Broker
>
> (Re-sent this as I wanted to change the subject)
>
> Hi,
>
> Last week our ActiveMQ message broker lost a message from the queue. We
> have a load balanced system with two separate brokers sharing a data store.
> At some point during the transfer of the lease, the original lease holder
> started shutting down, which started a cascade of failures as the Transport
> Connection failed to deliver the message.
>
> 16:54:07.869 Starting Job Scheduler Store
> 16:54:07.869 Persistence Adapter successfully started
> 16:54:08.456 Apache ActiveMQ 5.18.3 (gbldnsrv4pw4564, ID:GBLDNSRV4PW4564-50268-1705135858444-0:3) is starting
> 16:54:10.807 gbldnsrv4pw4563, no longer able to keep the exclusive lock so giving up being a master
> 16:54:10.807 Apache ActiveMQ 5.18.3 (gbldnsrv4pw4563, ID:GBLDNSRV4PW4563-63611-1706742264783-0:2) is shutting down
> 16:54:10.807 Transport Connection to: tcp://10.18.136.56:51504 failed: Broker BrokerService[gbldnsrv4pw4563] is being stopped
> 16:54:10.807 socketQueue interrupted - stopping
> 16:54:10.807 Could not accept connection during shutdown : null (null)
> 16:54:10.823 Transport Connection to: tcp://10.18.136.38:57430 failed: Broker BrokerService[gbldnsrv4pw4563] is being stopped
> 16:54:10.823 Failed delivery for (MessageId: ID-GBLDNSRV4PW4563-1706742293708-1-145409 on ExchangeId: ID-GBLDNSRV4PW4563-1706742293708-1-145309). On delivery attempt: 0 caught: org.springframework.jms.UncategorizedJmsException: Uncategorized exception occurred during JMS processing; nested exception is javax.jms.JMSException: Peer (vm://localhost#35661) disposed.
>
> However, when the second broker came to process the failed messages, there
> was an Primary Key exception from the sqldb that the message was already
> stored in the database.
>
> 16:54:37.357 Error while closing connection: Violation of PRIMARY KEY constraint 'PK__ACTIVEMQ__3214EC27C81AADFA'. Cannot insert duplicate key in object 'dbo.ACTIVEMQ_MSGS'. The duplicate key value is (510684907).
> 16:54:37.374 Ignoring SQLException, java.io.IOException: Violation of PRIMARY KEY constraint 'PK__ACTIVEMQ__3214EC27C81AADFA'. Cannot insert duplicate key in object 'dbo.ACTIVEMQ_MSGS'. The duplicate key value is (510684907).
> 16:54:37.421 Ignoring SQLException, java.io.IOException: Violation of PRIMARY KEY constraint 'PK__ACTIVEMQ__3214EC27C81AADFA'. Cannot insert duplicate key in object 'dbo.ACTIVEMQ_MSGS'. The duplicate key value is (510684909).
> 16:54:37.421 Commit failed: Violation of PRIMARY KEY constraint 'PK__ACTIVEMQ__3214EC27C81AADFA'. Cannot insert duplicate key in object 'dbo.ACTIVEMQ_MSGS'. The duplicate key value is (510684909).
> 16:54:37.483 Store COMMIT FAILED:
> 16:54:40.824 Failed delivery for (MessageId: ID-GBLDNSRV4PW4563-1706742293708-1-145409 on ExchangeId: ID-GBLDNSRV4PW4563-1706742293708-1-145309). On delivery attempt: 1 caught: java.lang.IllegalStateException: SendProcessor has not been started: sendTo(activemq://queue:
> 16:54:40.824 Failed delivery for (MessageId: ID-GBLDNSRV4PW4563-1706742293708-1-145410 on ExchangeId: ID-GBLDNSRV4PW4563-1706742293708-1-145312). On delivery attempt: 1 caught: java.lang.IllegalStateException: SendProcessor has not been started: sendTo(activemq://queue:)
>
> Would anyone here know what happened here? Is this a bug?
>
> Many thanks,
> Jack
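
P.S. On the watchdog idea in the reply: a minimal sketch of what such a supervisor could look like, in Python. It is only an illustration, not something ActiveMQ ships. The function name `supervise` and its parameters are made up for this example, and it restarts the broker process rather than rebooting the whole machine. It assumes the broker is launched by a command that stays in the foreground until the broker stops.

```python
import subprocess
import time
from typing import Callable, Sequence


def supervise(
    command: Sequence[str],
    max_restarts: int = 5,
    backoff_seconds: float = 10.0,
    run: Callable[[Sequence[str]], int] = subprocess.call,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `command` and start it again each time it exits.

    Hypothetical helper: returns the number of restarts performed and
    gives up after `max_restarts`, so a broker that cannot start at all
    does not loop forever.
    """
    restarts = 0
    while True:
        exit_code = run(command)  # blocks until the broker process ends
        print(f"broker exited with code {exit_code}")
        if restarts >= max_restarts:
            return restarts
        restarts += 1
        # Wait before restarting, e.g. to let a slow database recover.
        sleep(backoff_seconds)
```

It could be called as `supervise([r"D:\Apps\apache-activemq-5.18.3\bin\activemq.bat", "start"])`, where the script path is an assumption based on the data directory shown in the log. The `run` and `sleep` parameters exist only so the loop can be tried out without starting a real broker.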