Jens De Temmerman created ARTEMIS-3075:
------------------------------------------

             Summary: Scale down of activemq.notifications causes issues
                 Key: ARTEMIS-3075
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3075
             Project: ActiveMQ Artemis
          Issue Type: Bug
    Affects Versions: 2.16.0
            Reporter: Jens De Temmerman


Situation: a two-server cluster, live-live, statically configured. The servers are 
configured to scale down to each other when shut down.
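
For reference, a minimal sketch of the scale-down setup described above (the connector name is a placeholder, not taken from the actual configuration):

{code:xml}
<!-- broker.xml fragment on each server; "other-server" is a hypothetical connector name -->
<ha-policy>
  <live-only>
    <scale-down>
      <!-- on shutdown, move this server's messages to the other live server -->
      <connectors>
        <connector-ref>other-server</connector-ref>
      </connectors>
    </scale-down>
  </live-only>
</ha-policy>
{code}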

When the servers are live, each creates a queue on the other's 
"activemq.notifications" address. This is done in ClusterConnectionBridge; I 
assume it is needed for clustering to work properly.

The queues are named 
"notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158",
 with the serverUUID on server B pointing to server A and vice versa. (The UUID 
after "notif." is random.)

However, when server A is shut down, messages sometimes remain on this queue. 
This may be a timing issue, but regardless, when scale-down starts the 
ScaleDownHandler finds them:
{code:java}
 2021-01-19 15:32:42,780 DEBUG 
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down 
address activemq.notifications
2021-01-19 15:32:42,781 DEBUG 
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down 
messages on address activemq.notifications
2021-01-19 15:32:42,781 DEBUG 
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down 
messages on address activemq.notifications / performing loop on queue 
QueueImpl[name=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
 postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=bbd13595-4473-11eb-aadd-005056a1b044], 
temp=true]@64be4624
2021-01-19 15:32:42,786 DEBUG 
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Reading message 
CoreMessage[messageID=796047343,durable=true,userID=null,priority=0, 
timestamp=Tue Jan 19 15:32:42 CET 2021,expiration=0, durable=true, 
address=activemq.notifications,size=1000,properties=TypedProperties[_AMQ_Distance=0,_AMQ_ConsumerCount=0,_AMQ_User=REDACTED,_AMQ_ROUTING_TYPE=0,_AMQ_SessionName=7b37ba58-5a5d-11eb-b854-005056bd7345,_AMQ_Address=REDACTED,_AMQ_RemoteAddress=/REDACTED:42310,_AMQ_NotifTimestamp=1611066762756,_AMQ_ClusterName=2f88994d-5d37-4976-afcd-b879fb80993abbd13595-4473-11eb-aadd-005056a1b044,_AMQ_RoutingName=2f88994d-5d37-4976-afcd-b879fb80993a,_AMQ_NotifType=CONSUMER_CLOSED,_AMQ_FilterString=NULL-value]]@187982968
 from queue 
QueueImpl[name=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
 postOffice=PostOfficeImpl 
[server=ActiveMQServerImpl::serverUUID=bbd13595-4473-11eb-aadd-005056a1b044], 
temp=true]@64be4624
{code}

Of course, this queue doesn't exist on node B, but the scale-down causes it to be 
created there:

{code}
2021-01-19 15:32:42,814 DEBUG 
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Failed to get 
queue ID, creating queue [addressName=activemq.notifications, 
queueName=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
 routingType=MULTICAST, filter=_AMQ_Binding_Type<>2 AND _AMQ_NotifType IN 
('SESSION_CREATED','BINDING_ADDED','BINDING_REMOVED','CONSUMER_CREATED','CONSUMER_CLOSED','PROPOSAL','PROPOSAL_RESPONSE','UNPROPOSAL')
 AND _AMQ_Distance<1 AND (((_AMQ_Address NOT LIKE 'activemq%') AND 
(_AMQ_Address NOT LIKE '$.artemis.internal.sf.%') AND (_AMQ_Address NOT LIKE 
'activemq.management%'))) AND (_AMQ_NotifType = 'SESSION_CREATED' OR 
(_AMQ_Address NOT LIKE 'activemq.notifications%')), durable=false]
{code}

On node B, however, there will never be any consumers on this queue. Since 
activemq.notifications is a multicast address, every message sent to it is also 
distributed to this queue, where it is never consumed or purged (unless done 
manually).

I'm not sure what the correct course of action is, but this queue should somehow 
be excluded from scale-down, as the current behavior creates a situation that 
requires manual intervention to fix.
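
One possible direction (a hedged sketch, not the actual Artemis code): the scale-down logic could skip queues belonging to the cluster-notification mechanism, e.g. by checking the temporary flag and the "notif." naming prefix, both visible in the log output above. The class, method name, and signature below are hypothetical:

{code:java}
// Hypothetical guard; the "notif." prefix and the temp=true flag are taken
// from the log output above, not from the Artemis source.
public final class ScaleDownFilter {
    private static final String NOTIFICATION_QUEUE_PREFIX = "notif.";

    /** Returns true if the queue's messages should be moved during scale-down. */
    public static boolean shouldScaleDown(String queueName, boolean temporary) {
        // Cluster-notification queues are temporary and follow the
        // "notif.<uuid>.<binding>" pattern; moving their messages recreates
        // them on the target broker, so skip them.
        if (temporary && queueName.startsWith(NOTIFICATION_QUEUE_PREFIX)) {
            return false;
        }
        return true;
    }
}
{code}

With such a check, the temporary notification queue from the logs above would be left behind on shutdown instead of being recreated (with its messages) on node B.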



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
