Hey all,

An update on consistently recreating the issue of messages going missing. We can recreate the issue consistently with the following setup.

Artemis 2.13.0 installation comprising:

* node1 masterA (port 61616)
* node1 slaveB (port 61626)
* node2 masterB (port 61616)
* node2 slaveA (port 61616)

with A, B backup groups for replication.

Following is a snippet of the address-settings that we use in all components:

<address-setting match="#">
   <dead-letter-address>DLQ</dead-letter-address>
   <expiry-address>ExpiryQueue</expiry-address>
   <redelivery-delay>500</redelivery-delay>
   <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
   <max-redelivery-delay>50000</max-redelivery-delay>
   <max-delivery-attempts>10</max-delivery-attempts>
   <redistribution-delay>1000</redistribution-delay>
   <max-size-bytes>10485760</max-size-bytes>
   <page-size-bytes>2097152</page-size-bytes>
   <message-counter-history-day-limit>10</message-counter-history-day-limit>
   <address-full-policy>PAGE</address-full-policy>
   <auto-create-queues>true</auto-create-queues>
   <auto-delete-queues>false</auto-delete-queues>
   <auto-create-addresses>true</auto-create-addresses>
   <auto-delete-addresses>false</auto-delete-addresses>
   <send-to-dla-on-no-route>true</send-to-dla-on-no-route>
</address-setting>

The issue we face is messages getting lost (ignored on the target node) when they are moved from node1 masterA to node2 masterB via the $.artemis.internal.sf.artemis-cluster.<masterB-UUID> queue. We see this only if paging is active on the $.artemis.internal.sf.artemis-cluster.<masterB-UUID> queue.
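For anyone wanting to check their own logs for the same symptom, here is a minimal Python sketch. It assumes the "sent message [ID: ..., address: ...," line format produced by the custom logging plugin used in this thread (other plugins will log differently), and it flags sends whose logged address is a store-and-forward queue, which is what the paged case in the logs that follow shows. The helper names are hypothetical. The second function decodes the hex bytes printed for the _AMQ_ROUTE_TO property, on the assumption (consistent with the bytesAsLongs values in these logs) that each 8 bytes is one big-endian remote queue ID.

```python
import re

# Prefix of the cluster store-and-forward queue names seen in these logs.
SF_PREFIX = "$.artemis.internal.sf."

# Assumed line format: taken from the custom plugin output in this thread.
SENT_RE = re.compile(r"sent message \[ID: (\d+),\s*address:\s*([^,\s]+)")


def find_sf_addressed_sends(log_text):
    """Return (message_id, address) for sends logged against an SF queue."""
    hits = []
    for match in SENT_RE.finditer(log_text):
        message_id, address = match.group(1), match.group(2)
        if address.startswith(SF_PREFIX):
            hits.append((int(message_id), address))
    return hits


def decode_route_to(hex_bytes):
    """Decode an _AMQ_ROUTE_TO hex dump into 8-byte big-endian queue IDs."""
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    return [int.from_bytes(raw[i:i + 8], "big") for i in range(0, len(raw), 8)]


# Two shortened sample lines modelled on the logs in this message.
sample = (
    "INFO plugin - sent message [ID: 6442618526, address: "
    "$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2, "
    "correlationId: null\n"
    "INFO plugin - sent message [ID: 6442618510, address: "
    "APP.INTERNAL.NOTIFICATION, correlationId: null\n"
)

print(find_sf_addressed_sends(sample))
print(decode_route_to("0000 0000 0000 08F2"))
```

Only the first sample line is flagged, since its address is the store-and-forward queue rather than the application address. A hit from this scan is only a hint that the message was paged on the bridge queue, not proof that it was dropped on the target node.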
Here is log of msg sent from masterA in that case: [paging enabled] 2020-09-16 19:29:33.693 [373] TRACE o.a.a.a.c.s.i.ServerSessionImpl - send(message=CoreMessage[messageID=6442618526,durable=true,userID=91472afa-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:33 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2355,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTING_TYPE=1]]@1118152119, direct=true) being called 2020-09-16 19:29:33.693 [373] TRACE o.a.a.a.c.s.i.SecurityStoreImpl - checking access permissions to APP.INTERNAL.NOTIFICATION 2020-09-16 19:29:33.693 [373] TRACE o.a.a.a.c.p.i.BindingsImpl - Routing message CoreMessage[messageID=6442618526,durable=true,userID=91472afa-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:33 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2355,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTING_TYPE=1]]@1118152119 on binding=BindingsImpl [name=APP.INTERNAL.NOTIFICATION] current context::RoutingContextImpl(Address=APP.INTERNAL.NOTIFICATION, routingType=ANYCAST, PreviousAddress=null previousRoute:null, reusable=null, version=0) .................................................. 
2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.s.c.i.RemoteQueueBindingImpl - Adding remoteQueue ID = 2290 into message=CoreMessage[messageID=6442618526,durable=true,userID=91472afa-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:33 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2546,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@1118152119 store-forward-queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=aed43866-f805-11ea-87b6-02420a6c4bd7], temp=false]@24359ac5 2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.p.i.PostOfficeImpl - Message after routed=CoreMessage[messageID=6442618526,durable=true,userID=91472afa-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:33 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2546,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@1118152119 RoutingContextImpl(Address=APP.INTERNAL.NOTIFICATION, routingType=ANYCAST, PreviousAddress=APP.INTERNAL.NOTIFICATION previousRoute:ANYCAST, reusable=false, version=-2147483601) .................................................. 
***** durable queues $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2: - queueID=2045 address:$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 name:$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 filter:null ***** non durable for $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2: .................................................. 2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.p.i.PagingManagerImpl - Adding pageTransaction 6442618520 2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.r.ReplicatedJournal - Append record txID=6442618527 recordType = 41 2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.j.i.JournalImpl - scheduling appendAddRecordTransactional:txID=6442618520,id=6442618527, userRecordType=41, record = PageCountRecordInc [queueID=2045, value=1, persistentSize=2644] 2020-09-16 19:29:33.694 [373] TRACE o.a.a.a.c.p.i.PagingStoreImpl - Paging message PagedMessageImpl [queueIDs=[2045], transactionID=6442618520, message=CoreMessage[messageID=6442618526,durable=true,userID=91472afa-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:33 IST 2020,expiration=0, durable=true, address=$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2,size=2644,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@1118152119] on pageStore $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 pageNr=238 2020-09-16 19:29:33.695 [373] INFO c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 6442618526, address: $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2, correlationId: null, props: 
[TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=91430c47-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]], body:{}, session name: 91441db9-f84a-11ea-96f2-0242ac170002, session connection: [clientId: null, remoteAddress: /10.74.225.79:39478], result: OK If paging is not active the sequence looks a little bit different: [paging not enabled] 2020-09-16 19:29:32.585 [407] TRACE o.a.a.a.c.s.i.ServerSessionImpl - send(message=CoreMessage[messageID=6442618510,durable=true,userID=909e67d4-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:32 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2355,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTING_TYPE=1]]@759241047, direct=true) being called 2020-09-16 19:29:32.585 [407] TRACE o.a.a.a.c.s.i.SecurityStoreImpl - checking access permissions to APP.INTERNAL.NOTIFICATION 2020-09-16 19:29:32.586 [407] TRACE o.a.a.a.c.p.i.BindingsImpl - Routing message CoreMessage[messageID=6442618510,durable=true,userID=909e67d4-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:32 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2355,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTING_TYPE=1]]@759241047 on binding=BindingsImpl [name=APP.INTERNAL.NOTIFICATION] current context::RoutingContextImpl(Address=APP.INTERNAL.NOTIFICATION, routingType=ANYCAST, PreviousAddress=null previousRoute:null, reusable=null, version=0) .................................................. 
2020-09-16 19:29:32.586 [407] TRACE o.a.a.a.c.s.c.i.RemoteQueueBindingImpl - Adding remoteQueue ID = 2290 into message=CoreMessage[messageID=6442618510,durable=true,userID=909e67d4-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:32 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2546,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@759241047 store-forward-queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=aed43866-f805-11ea-87b6-02420a6c4bd7], temp=false]@24359ac5 2020-09-16 19:29:32.586 [407] TRACE o.a.a.a.c.p.i.PostOfficeImpl - Message after routed=CoreMessage[messageID=6442618510,durable=true,userID=909e67d4-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:32 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2546,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@759241047 RoutingContextImpl(Address=APP.INTERNAL.NOTIFICATION, routingType=ANYCAST, PreviousAddress=APP.INTERNAL.NOTIFICATION previousRoute:ANYCAST, reusable=false, version=-2147483601) .................................................. 
***** durable queues $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2: - queueID=2045 address:$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 name:$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 filter:null ***** non durable for $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2: .................................................. 2020-09-16 19:29:32.587 [407] WARN o.a.a.a.c.server - AMQ222038: Starting paging on address '$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2'; size is currently: 47,948 bytes; max-size-bytes: 200; global-size-bytes: 15,505 2020-09-16 19:29:32.587 [407] TRACE o.a.a.a.c.r.ReplicatedJournal - Append record txID=6442618510 recordType = 45 2020-09-16 19:29:32.587 [407] TRACE o.a.a.a.c.j.i.JournalImpl - scheduling appendAddRecordTransactional:txID=6442618506,id=6442618510, userRecordType=45, record = CoreMessage[messageID=6442618510,durable=true,userID=909e67d4-f84a-11ea-96f2-0242ac170002,priority=4, timestamp=Wed Sep 16 19:29:32 IST 2020,expiration=0, durable=true, address=APP.INTERNAL.NOTIFICATION,size=2546,properties=TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]]@759241047 2020-09-16 19:29:32.587 [407] TRACE o.a.a.a.c.r.ReplicatedJournal - AppendUpdateRecord txid=6442618506 id = 6442618510 , recordType = 32 2020-09-16 19:29:32.588 [407] TRACE o.a.a.a.c.j.i.JournalImpl - scheduling appendUpdateRecordTransactional::txID=6442618506,id=6442618510, userRecordType=32, record = QueueEncoding [queueID=2045] 2020-09-16 19:29:32.589 [407] INFO c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 6442618510, address: APP.INTERNAL.NOTIFICATION, correlationId: null, props: 
[TypedProperties[messageType=com.company.app.infrastructure.notification.NotificationData,__AMQ_CID=9097d821-f84a-11ea-96f2-0242ac170002,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2=[0000 0000 0000 08F2),bytesAsLongs(2290],_AMQ_ROUTING_TYPE=1]], body: {}, session name: 9098e993-f84a-11ea-96f2-0242ac170002, session connection: [clientId: null, remoteAddress: /10.74.225.79:39477], result: OK

Comparing the last lines of the two sequences, we can see that the final address of the paged message has been changed to point to the bridge queue. On masterB that message is then ignored (NO_BINDINGS) because of the invalid address '$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2'.

Our conclusion is that paging must not be configured for bridge queues, so we should not define any paging in the default address-settings.

Q1) Can you confirm this might be the case? Could paging for these queues have this type of impact?

Separately, looking at the metrics for the $.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2 queue after the node has been up for some time, we observe its size constantly growing:

artemis_delivering_persistent_size{address="$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2",queue="$.artemis.internal.sf.artemis-cluster.9d2035d0-f805-11ea-84d8-02428d02b4b2",} 651211.0

However, testing confirms we don't lose any messages.

Q2) Could there be an issue with this metric, or might some messages not be getting delivered?

Paul

-----Original Message-----
From: Paul Whelan
Sent: Wednesday 16 September 2020 10:04
To: users@activemq.apache.org
Subject: RE: Problem with messages going missing on queues

Hey Justin

Detail of failure

A single producer (gatling) was connected to Master 1 IP 10.70.120.243 PORT 61616
A single consumer (gatling) was connected to Master 2 IP 10.70.120.221 PORT 61616
The messages were being redirected to Master 2 IP 10.70.120.221 PORT 61616 as it had this single consumer.

I'll try to get a minimal setup to reproduce the problem. That's half my problem: I don't know how to reproduce it on demand. When the nodes are first set up they work great; at some point they stop working, and what causes this is still unknown. I include the broker.xml for the mom1 master. When things are working we don't see these NO_BINDINGS messages; they seem to be present only when messages are being dropped/ignored.

Re the email for Slack: yes, that email please.

Here's the Master 1 (IP 10.70.120.243) broker.xml. I can share the others if that would be useful; they all look similar, with the ha-policy section slightly changed for the slaves.

==
<?xml version='1.0'?>
<configuration xmlns="urn:activemq"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xi="http://www.w3.org/2001/XInclude"
               xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">

   <core xmlns="urn:activemq:core"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="urn:activemq:core ">

      <name>MOM-MASTER-MOM1</name>
      <persistence-enabled>true</persistence-enabled>
      <bindings-directory>/opt/xyz/data/bindings</bindings-directory>
      <journal-directory>/opt/xyz/data/journal</journal-directory>
      <large-messages-directory>/opt/xyz/data/largemessages</large-messages-directory>
      <paging-directory>/opt/xyz/data/paging</paging-directory>
      <journal-type>ASYNCIO</journal-type>
      <journal-datasync>true</journal-datasync>
      <journal-min-files>2</journal-min-files>
      <journal-pool-files>10</journal-pool-files>
      <journal-device-block-size>4096</journal-device-block-size>
      <journal-file-size>10M</journal-file-size>

      <!--
       This value was determined through a calculation.
       Your system could perform 12.5 writes per millisecond on the current journal configuration.
       That translates as a sync write every 80000 nanoseconds.

       Note: If you specify 0 the system will perform writes directly to the disk.
             We recommend this to be 0 if you are using journalType=MAPPED and journal-datasync=false.
      -->
      <journal-buffer-timeout>80000</journal-buffer-timeout>

      <!-- When using ASYNCIO, this will determine the writing queue depth for libaio. -->
      <journal-max-io>4096</journal-max-io>

      <!-- how often we are looking for how many bytes are being used on the disk in ms -->
      <disk-scan-period>5000</disk-scan-period>

      <!-- once the disk hits this limit the system will block, or close the connection in certain protocols that won't support flow control. -->
      <max-disk-usage>99</max-disk-usage>

      <!-- should the broker detect dead locks and other issues -->
      <critical-analyzer>true</critical-analyzer>
      <critical-analyzer-timeout>120000</critical-analyzer-timeout>
      <critical-analyzer-check-period>60000</critical-analyzer-check-period>
      <critical-analyzer-policy>HALT</critical-analyzer-policy>

      <graceful-shutdown-enabled>true</graceful-shutdown-enabled>
      <graceful-shutdown-timeout>5</graceful-shutdown-timeout>

      <!-- The system will enter into page mode once you hit this limit.
           This is an estimate in bytes of how much the messages are using in memory

           The system will use half of the available memory (-Xmx) by default for the global-max-size.
           You may specify a different value here if you need to customize it to your needs.

           <global-max-size>100Mb</global-max-size>
      -->

      <!-- Connectors -->
      <connectors>
         <connector name="netty-connector">tcp://mom1.sent.local:61616</connector>
      </connectors>

      <!-- Acceptors -->
      <acceptors>
         <acceptor name="netty-acceptor">tcp://mom1.sent.local:61616</acceptor>
      </acceptors>

      <ha-policy>
         <replication>
            <master/>
            <group-name>A</group-name>
            <check-for-live-server>true</check-for-live-server>
         </replication>
      </ha-policy>

      <broadcast-groups>
         <broadcast-group name="artemis-broadcast-group">
            <group-address>231.7.7.7</group-address>
            <group-port>9876</group-port>
            <broadcast-period>100</broadcast-period>
            <connector-ref>netty-connector</connector-ref>
         </broadcast-group>
      </broadcast-groups>

      <discovery-groups>
         <discovery-group name="artemis-discovery-group">
            <group-address>231.7.7.7</group-address>
            <group-port>9876</group-port>
            <refresh-timeout>10000</refresh-timeout>
         </discovery-group>
      </discovery-groups>

      <security-enabled>true</security-enabled>
      <cluster-user>user</cluster-user>
      <cluster-password>password</cluster-password>

      <cluster-connections>
         <cluster-connection name="artemis-cluster">
            <connector-ref>netty-connector</connector-ref>
            <retry-interval>500</retry-interval>
            <use-duplicate-detection>true</use-duplicate-detection>
            <message-load-balancing>ON_DEMAND</message-load-balancing>
            <max-hops>1</max-hops>
            <discovery-group-ref discovery-group-name="artemis-discovery-group"/>
         </cluster-connection>
      </cluster-connections>

      <security-settings>
         <security-setting match="#">
            <permission type="createNonDurableQueue" roles="amq"/>
            <permission type="deleteNonDurableQueue" roles="amq"/>
            <permission type="createDurableQueue" roles="amq"/>
            <permission type="deleteDurableQueue" roles="amq"/>
            <permission type="createAddress" roles="amq"/>
            <permission type="deleteAddress" roles="amq"/>
            <permission type="consume" roles="amq"/>
            <permission type="browse" roles="amq"/>
            <permission type="send" roles="amq"/>
            <!-- we need this otherwise ./artemis data imp wouldn't work -->
            <permission type="manage" roles="amq"/>
         </security-setting>
      </security-settings>

      <address-settings>
         <address-setting match="activemq.#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>500</redelivery-delay>
            <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
            <max-redelivery-delay>50000</max-redelivery-delay>
            <max-delivery-attempts>10</max-delivery-attempts>
            <redistribution-delay>1000</redistribution-delay>
            <max-size-bytes>10485760</max-size-bytes>
            <page-size-bytes>2097152</page-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-create-jms-queues>true</auto-create-jms-queues>
            <auto-create-jms-topics>true</auto-create-jms-topics>
         </address-setting>
         <address-setting match="ActiveMQ.Advisory.#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>500</redelivery-delay>
            <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
            <max-redelivery-delay>50000</max-redelivery-delay>
            <max-delivery-attempts>10</max-delivery-attempts>
            <redistribution-delay>1000</redistribution-delay>
            <max-size-bytes>10485760</max-size-bytes>
            <page-size-bytes>2097152</page-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-delete-queues>true</auto-delete-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-delete-addresses>true</auto-delete-addresses>
         </address-setting>
         <address-setting match="#">
            <dead-letter-address>DLQ</dead-letter-address>
            <expiry-address>ExpiryQueue</expiry-address>
            <redelivery-delay>500</redelivery-delay>
            <redelivery-delay-multiplier>1.5</redelivery-delay-multiplier>
            <max-redelivery-delay>50000</max-redelivery-delay>
            <max-delivery-attempts>10</max-delivery-attempts>
            <redistribution-delay>1000</redistribution-delay>
            <max-size-bytes>10485760</max-size-bytes>
            <page-size-bytes>2097152</page-size-bytes>
            <message-counter-history-day-limit>10</message-counter-history-day-limit>
            <address-full-policy>PAGE</address-full-policy>
            <auto-create-queues>true</auto-create-queues>
            <auto-delete-queues>false</auto-delete-queues>
            <auto-create-addresses>true</auto-create-addresses>
            <auto-delete-addresses>false</auto-delete-addresses>
            <send-to-dla-on-no-route>true</send-to-dla-on-no-route>
         </address-setting>
      </address-settings>

      <addresses>
         <address name="DLQ">
            <anycast>
               <queue name="DLQ"/>
            </anycast>
         </address>
         <address name="ExpiryQueue">
            <anycast>
               <queue name="ExpiryQueue"/>
            </anycast>
         </address>
      </addresses>
   </core>
</configuration>
==

Thanks
Paul

------------------

Can you elaborate on the use-case a bit? What clients are connected to which nodes? Also, what's the minimal environment you need to reproduce the issue? It looks like maybe you could reproduce this with just 2 live brokers in a cluster and using the Artemis CLI to send & receive messages, but without more details about your use-case it's hard to tell. An automated reproducer or at least steps to reproduce would be great.

You need to be specifically invited to the Apache Slack in order to join. Is the email address you sent the original message from the one you'd want to use to receive the invite?

Justin

On Tue, Sep 15, 2020 at 9:23 AM Paul Whelan <paul.whe...@sentenial.com> wrote:

> Technical Details
>
> Machine 1 mom1
> Master 1 IP 10.70.120.243 PORT 61616
> Slave 2 IP 10.70.120.243 PORT 62626
> Machine 2 mom2
> Master 2 IP 10.70.120.221 PORT 61616
> Slave 1 IP 10.70.120.221 PORT 62626
>
> Machine 3 Jenkins Slave
> IP 10.74.175.58
>
> Spring framework application
> Using spring boot version 2.3.0.RELEASE An instance of a master and
> slave are on each install Packaged as a docker container.
> Slave on machine 1 is paired with master on machine 2 Slave on machine > 2 is paired with master on machine 1 > > Using gatling to performance test the install we noticed missing messages. > Connection string (tcp://mom01:61616,tcp:// > mom02:61616)?ha=true&reconnectAttempts=-1 > > To aid with debugging restricted the test to just sending in one > message, here are the logs associated with this trial run. > Both master nodes are operational. > > # msg1 (correct) > # node 1 mom1 > 2020-09-14 20:45:27.864 [386] TRACE o.a.a.a.c.s.i.ServerSessionImpl - > send(message=CoreMessage[messageID=2781115,durable=true,userID=d6f3033 > c-f6c2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=497,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9 > c-9b6789948b7d,JMSType=test_jms_type]]@861485015, > direct=true) being called > 2020-09-14 20:45:27.865 [386] TRACE > o.a.a.a.c.s.c.i.RemoteQueueBindingImpl > - Adding remoteQueue ID = 2044128 into > message=CoreMessage[messageID=2781115,durable=true,userID=d6f3033c-f6c > 2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=688,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9 > c-9b6789948b7d,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5 > d790-f449-11ea-a593-0242ed448895=[0000 > 0000 001F > 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@861485015 > store-forward-queue=QueueImpl[name=$.artemis.internal.sf.artemis-clust > er.50f5d790-f449-11ea-a593-0242ed448895, > postOffice=PostOfficeImpl > [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5c > d1d55], > 
temp=false]@32961408 > 2020-09-14 20:45:27.866 [386] INFO > c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 2781115, > address: GATLING_JMS_TEST_IN, correlationId: > 6c54e907-d5d6-492c-8a9c-9b6789948b7d, props: > [TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9c-9b6789948b7d,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895=[0000 > 0000 001F 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]], body: [ h > e l l o ]}, session name: d6ea028b-f6c2-11ea-a9da-0050569d3ae9, > session > connection: [clientId: null, remoteAddress: /10.74.175.58:41692], result: > OK > 2020-09-14 20:45:27.873 [13922] DEBUG o.a.a.a.c.s.i.QueueImpl - > QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11e > a-a593-0242ed448895, > postOffice=PostOfficeImpl > [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5c > d1d55], > temp=false]@32961408 doing deliver. 
messageReferences=0 > 2020-09-14 20:45:27.874 [13922] TRACE o.a.a.a.c.s.i.QueueImpl - Queue > $.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed44 > 8895 > is delivering reference > Reference[2781115]:RELIABLE:CoreMessage[messageID=2781115,durable=true > ,userID=d6f3033c-f6c2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=688,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9 > c-9b6789948b7d,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5 > d790-f449-11ea-a593-0242ed448895=[0000 > 0000 001F > 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@861485015 > 2020-09-14 20:45:27.880 [13922] TRACE o.a.a.a.c.s.c.i.BridgeImpl - > Bridge ClusterConnectionBridge@3e3abc7e > [name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-02 > 42ed448895, > queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f4 > 49-11ea-a593-0242ed448895, > postOffice=PostOfficeImpl > [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5c > d1d55], > temp=false]@32961408 targetConnector=ServerLocatorImpl > (identity=(Cluster-connection-bridge::ClusterConnectionBridge@3e3abc7e > [name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-02 > 42ed448895, > queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f4 > 49-11ea-a593-0242ed448895, > postOffice=PostOfficeImpl > [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5c > d1d55], > temp=false]@32961408 targetConnector=ServerLocatorImpl > [initialConnectors=[TransportConfiguration(name=netty-connector, > factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConn > ectorFactory) > ?port=61616&host=mom02-sent-local], > discoveryGroupConfiguration=null]]::ClusterConnectionImpl@772166315[no > 
deUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55, > connector=TransportConfiguration(name=netty-connector, > factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConn > ectorFactory) ?port=61616&host=mom01-sent-local, address=, > server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd > 1d55])) > [initialConnectors=[TransportConfiguration(name=netty-connector, > factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConn > ectorFactory) ?port=61616&host=mom02-sent-local], > discoveryGroupConfiguration=null]] is handling > reference=Reference[2781115]:RELIABLE:CoreMessage[messageID=2781115,du > rable=true,userID=d6f3033c-f6c2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=688,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9 > c-9b6789948b7d,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5 > d790-f449-11ea-a593-0242ed448895=[0000 > 0000 001F > 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@861485015 > > Messages seem to have arrived into node1 mom1 and they are being > transported over to node2 mom2 > > # node 2 mom2 > 2020-09-14 20:45:27.890 [352] INFO > c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 2169502, > address: GATLING_JMS_TEST_IN, correlationId: > 6c54e907-d5d6-492c-8a9c-9b6789948b7d, props: > [TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9c-9b6789948b7d,JMSType=test_jms_type]], > body: [ h e l l o ]}, session name: > 5644a321-f449-11ea-9ce1-0242b5cd1d55, session connection: [clientId: > null, > remoteAddress: /10.70.120.243:37768], result: OK > 2020-09-14 20:45:27.893 [324] INFO > c.s.a.s.SL4JLoggingActiveMQServerPlugin - delivered message [ID: > 2169502, > 
address: GATLING_JMS_TEST_IN, correlationId: > 6c54e907-d5d6-492c-8a9c-9b6789948b7d, props: > [TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=6c54e907-d5d6-492c-8a9c-9b6789948b7d,JMSType=test_jms_type]], > body: [ h e l l o ], to consumer [connectionClientId: null, > remoteAddres: /10.74.175.58:59048, queueName: GATLING_JMS_TEST_IN, > sessionName: d6a9c537-f6c2-11ea-a9da-0050569d3ae9] > # msg2 (NO_BINDINGS) > 2020-09-14 20:45:27.969 [405] TRACE o.a.a.a.c.s.i.ServerSessionImpl - > send(message=CoreMessage[messageID=2781126,durable=true,userID=d702e1c > 0-f6c2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=497,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948 > a-fa387ea61b41,JMSType=test_jms_type]]@660678797, > direct=true) being called > 2020-09-14 20:45:27.969 [405] TRACE > o.a.a.a.c.s.c.i.RemoteQueueBindingImpl > - Adding remoteQueue ID = 2044128 into > message=CoreMessage[messageID=2781126,durable=true,userID=d702e1c0-f6c > 2-11ea-a9da-0050569d3ae9,priority=4, > timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, > address=GATLING_JMS_TEST_IN,size=688,properties=TypedProperties[__AMQ_ > CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JM > S_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948 > a-fa387ea61b41,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5 > d790-f449-11ea-a593-0242ed448895=[0000 > 0000 001F > 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@660678797 > store-forward-queue=QueueImpl[name=$.artemis.internal.sf.artemis-clust > er.50f5d790-f449-11ea-a593-0242ed448895, > postOffice=PostOfficeImpl > [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5c > d1d55], > 
temp=false]@32961408
> 2020-09-14 20:45:27.971 [405] INFO c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 2781126, address: $.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, correlationId: d48d23aa-dfb7-46da-948a-fa387ea61b41, props: [TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948a-fa387ea61b41,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895=[0000 0000 001F 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]], body: [ h e l l o ]}, session name: d701a93f-f6c2-11ea-a9da-0050569d3ae9, session connection: [clientId: null, remoteAddress: /10.74.175.58:41692], result: OK
> 2020-09-14 20:45:27.975 [13901] DEBUG o.a.a.a.c.s.i.QueueImpl - QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55], temp=false]@32961408 doing deliver.
messageReferences=0
> 2020-09-14 20:45:27.975 [13901] TRACE o.a.a.a.c.s.i.QueueImpl - Queue $.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895 is delivering reference PagedReferenceImpl [position=PagePositionImpl [pageNr=2818, messageNr=0, recordID=-1, fileOffset=0], message=PagedMessageImpl [queueIDs=[4403], transactionID=-1, message=CoreMessage[messageID=2781126,durable=true,userID=d702e1c0-f6c2-11ea-a9da-0050569d3ae9,priority=4, timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, address=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895,size=798,properties=TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948a-fa387ea61b41,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895=[0000 0000 001F 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@660678797], deliveryTime=0, persistedCount=0, deliveryCount=0, subscription=PageSubscriptionImpl [cursorId=4403, queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55], temp=false]@32961408, filter = null]]
> 2020-09-14 20:45:27.975 [13901] TRACE o.a.a.a.c.s.c.i.BridgeImpl - Bridge ClusterConnectionBridge@3e3abc7e [name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55], temp=false]@32961408 targetConnector=ServerLocatorImpl (identity=(Cluster-connection-bridge::ClusterConnectionBridge@3e3abc7e [name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895,
queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55], temp=false]@32961408 targetConnector=ServerLocatorImpl [initialConnectors=[TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=61616&host=mom02-sent-local], discoveryGroupConfiguration=null]]::ClusterConnectionImpl@772166315[nodeUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55, connector=TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=61616&host=mom01-sent-local, address=, server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55])) [initialConnectors=[TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory) ?port=61616&host=mom02-sent-local], discoveryGroupConfiguration=null]] is handling reference=PagedReferenceImpl [position=PagePositionImpl [pageNr=2818, messageNr=0, recordID=-1, fileOffset=0], message=PagedMessageImpl [queueIDs=[4403], transactionID=-1, message=CoreMessage[messageID=2781126,durable=true,userID=d702e1c0-f6c2-11ea-a9da-0050569d3ae9,priority=4, timestamp=Mon Sep 14 20:45:27 IST 2020,expiration=0, durable=true, address=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895,size=798,properties=TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948a-fa387ea61b41,_AMQ_ROUTE_TO$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895=[0000 0000 001F 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]]@660678797], deliveryTime=0, persistedCount=0, deliveryCount=0,
subscription=PageSubscriptionImpl [cursorId=4403, queue=QueueImpl[name=$.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::serverUUID=89ec03ba-f448-11ea-9ce1-0242b5cd1d55], temp=false]@32961408, filter = null]]
> 2020-09-14 20:45:27.980 [378] INFO c.s.a.s.SL4JLoggingActiveMQServerPlugin - sent message [ID: 2169509, address: $.artemis.internal.sf.artemis-cluster.50f5d790-f449-11ea-a593-0242ed448895, correlationId: d48d23aa-dfb7-46da-948a-fa387ea61b41, props: [TypedProperties[__AMQ_CID=d6d51af8-f6c2-11ea-a9da-0050569d3ae9,JMSReplyTo=queue://GATLING_JMS_TEST_OUT,_AMQ_ROUTING_TYPE=1,JMSCorrelationID=d48d23aa-dfb7-46da-948a-fa387ea61b41,_AMQ_ROUTE_TO=[0000 0000 001F 30E0),bytesAsLongs(2044128],JMSType=test_jms_type]], body: [ h e l l o ]}, session name: 5644a321-f449-11ea-9ce1-0242b5cd1d55, session connection: [clientId: null, remoteAddress: /10.70.120.243:37768], result: NO_BINDINGS
>
> The message goes missing because of the NO_BINDINGS result.
> How can we resolve this issue?
>
> PS: How do I sign up for the Slack channel without an apache.org email address?
> Thanks
> Paul
> The views and opinions expressed in this email may not reflect the views and opinions of any member of Sentenial Limited. The information contained in this message is confidential and may also be privileged. It is intended only for the addressee named above. The unauthorised use, disclosure, copying or alteration of this message is strictly prohibited. If you are not the addressee (or responsible for delivery of the message to the addressee), please notify the originator immediately by return message and destroy the original message. This message and any attachments have been scanned for viruses prior to leaving our network.
However, we do not guarantee the security of this message and will not be responsible for any damages arising as a result of any virus being passed on or arising from any alteration of this message by a third party. We may monitor emails sent to and from our network. Sentenial Limited; Registered in Ireland, No. 374137. Registered Office: Unit 16F, Maynooth Business Campus, Maynooth, Co Kildare, Ireland.
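
For anyone cross-checking IDs between the two brokers: the _AMQ_ROUTE_TO header in the trace above is printed twice, once as raw hex and once as bytesAsLongs(...). Below is a minimal Python sketch, assuming the bracketed hex is one 8-byte big-endian value per queue ID (which the log's own bytesAsLongs(2044128) rendering suggests). It checks that the hex decodes to the same value as the "remoteQueue ID = 2044128" reported by RemoteQueueBindingImpl. The function names are hypothetical helpers for reading logs, not Artemis code.

```python
import re

# Header rendering copied from the target-side log entry above (line wraps joined).
LOG_FRAGMENT = "_AMQ_ROUTE_TO=[0000 0000 001F 30E0),bytesAsLongs(2044128]"


def extract_hex_dump(fragment: str) -> str:
    """Pull the space-separated hex bytes out of an _AMQ_ROUTE_TO rendering.

    Also accepts the source-side form, where the store-and-forward queue
    name is appended to the property name before the '='.
    """
    match = re.search(r"_AMQ_ROUTE_TO[^=]*=\[([0-9A-Fa-f ]+)\)", fragment)
    if match is None:
        raise ValueError("no _AMQ_ROUTE_TO header found")
    return match.group(1).strip()


def decode_route_ids(hex_dump: str) -> list[int]:
    """Decode the hex dump into 64-bit IDs (assumption: big-endian, 8 bytes each)."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    if len(raw) % 8 != 0:
        raise ValueError("expected a multiple of 8 bytes")
    return [int.from_bytes(raw[i:i + 8], "big") for i in range(0, len(raw), 8)]


ids = decode_route_ids(extract_hex_dump(LOG_FRAGMENT))
print(ids)  # prints [2044128]
```

So the ID carried in the header on the target side is the same one masterA attached on the source side. That only rules out a corrupted header; it does not by itself explain the NO_BINDINGS result.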