Jens De Temmerman created ARTEMIS-3075:
------------------------------------------
Summary: Scale down of activemq.notifications causes issues
Key: ARTEMIS-3075
URL: https://issues.apache.org/jira/browse/ARTEMIS-3075
Project: ActiveMQ Artemis
Issue Type: Bug
Affects Versions: 2.16.0
Reporter: Jens De Temmerman
Situation: a two-server cluster, live-live, configured with static connectors. Each
server is configured to scale down to the other when it is shut down.
When the servers are live, they create a queue on each other's
"activemq.notifications" address. This is done in ClusterConnectionBridge; I
assume this is needed for clustering to work properly.
The queues are named
"notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158",
with the serverUUID on server B pointing to server A and vice versa (the UUID
after "notif." is random).
However, when server A is shut down, messages sometimes remain on this queue.
This may be a timing issue, but regardless, when scale-down starts the
ScaleDownHandler finds these messages:
{code:java}
2021-01-19 15:32:42,780 DEBUG
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down
address activemq.notifications
2021-01-19 15:32:42,781 DEBUG
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down
messages on address activemq.notifications
2021-01-19 15:32:42,781 DEBUG
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Scaling down
messages on address activemq.notifications / performing loop on queue
QueueImpl[name=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
postOffice=PostOfficeImpl
[server=ActiveMQServerImpl::serverUUID=bbd13595-4473-11eb-aadd-005056a1b044],
temp=true]@64be4624
2021-01-19 15:32:42,786 DEBUG
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Reading message
CoreMessage[messageID=796047343,durable=true,userID=null,priority=0,
timestamp=Tue Jan 19 15:32:42 CET 2021,expiration=0, durable=true,
address=activemq.notifications,size=1000,properties=TypedProperties[_AMQ_Distance=0,_AMQ_ConsumerCount=0,_AMQ_User=REDACTED,_AMQ_ROUTING_TYPE=0,_AMQ_SessionName=7b37ba58-5a5d-11eb-b854-005056bd7345,_AMQ_Address=REDACTED,_AMQ_RemoteAddress=/REDACTED:42310,_AMQ_NotifTimestamp=1611066762756,_AMQ_ClusterName=2f88994d-5d37-4976-afcd-b879fb80993abbd13595-4473-11eb-aadd-005056a1b044,_AMQ_RoutingName=2f88994d-5d37-4976-afcd-b879fb80993a,_AMQ_NotifType=CONSUMER_CLOSED,_AMQ_FilterString=NULL-value]]@187982968
from queue
QueueImpl[name=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
postOffice=PostOfficeImpl
[server=ActiveMQServerImpl::serverUUID=bbd13595-4473-11eb-aadd-005056a1b044],
temp=true]@64be4624
{code}
Of course, this queue doesn't exist on node B, but the scale-down then causes it
to be created there:
{code}
2021-01-19 15:32:42,814 DEBUG
[org.apache.activemq.artemis.core.server.impl.ScaleDownHandler] Failed to get
queue ID, creating queue [addressName=activemq.notifications,
queueName=notif.6dca3cd4-5a61-11eb-8600-005056a1a158.ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158,
routingType=MULTICAST, filter=_AMQ_Binding_Type<>2 AND _AMQ_NotifType IN
('SESSION_CREATED','BINDING_ADDED','BINDING_REMOVED','CONSUMER_CREATED','CONSUMER_CLOSED','PROPOSAL','PROPOSAL_RESPONSE','UNPROPOSAL')
AND _AMQ_Distance<1 AND (((_AMQ_Address NOT LIKE 'activemq%') AND
(_AMQ_Address NOT LIKE '$.artemis.internal.sf.%') AND (_AMQ_Address NOT LIKE
'activemq.management%'))) AND (_AMQ_NotifType = 'SESSION_CREATED' OR
(_AMQ_Address NOT LIKE 'activemq.notifications%')), durable=false]
{code}
But on node B there will never be any consumers on this queue. Since
activemq.notifications is multicast, every message sent to it is also routed to
this queue, where it is never consumed or purged (unless done manually).
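For reference, this is roughly the manual cleanup currently needed on node B (a
sketch only: it assumes JMX is enabled on the broker, and the JMX URL, broker name
and queue name are placeholders to be replaced with the real values).
{code:java}
import javax.management.MBeanServerConnection;
import javax.management.MBeanServerInvocationHandler;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

import org.apache.activemq.artemis.api.core.RoutingType;
import org.apache.activemq.artemis.api.core.SimpleString;
import org.apache.activemq.artemis.api.core.management.ObjectNameBuilder;
import org.apache.activemq.artemis.api.core.management.QueueControl;

public class PurgeLeftoverNotifQueue {
   public static void main(String[] args) throws Exception {
      // Placeholders: JMX endpoint of node B, its broker name, and the stranded queue name.
      String jmxUrl = "service:jmx:rmi:///jndi/rmi://node-b:1099/jmxrmi";
      String brokerName = "node-b";
      String notifQueue = "notif.6dca3cd4-5a61-11eb-8600-005056a1a158."
            + "ActiveMQServerImpl_serverUUID=903148b3-4455-11eb-bfed-005056a1a158";

      JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(jmxUrl));
      try {
         MBeanServerConnection mbeanServer = connector.getMBeanServerConnection();
         ObjectName queueObjectName = ObjectNameBuilder.create("org.apache.activemq.artemis", brokerName, true)
               .getQueueObjectName(SimpleString.toSimpleString("activemq.notifications"),
                                   SimpleString.toSimpleString(notifQueue),
                                   RoutingType.MULTICAST);
         QueueControl queue = MBeanServerInvocationHandler.newProxyInstance(
               mbeanServer, queueObjectName, QueueControl.class, false);
         // A null filter matches every message, so this drains the stranded notifications.
         queue.removeMessages((String) null);
      } finally {
         connector.close();
      }
   }
}
{code}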
I'm not sure what the correct course of action is, but somehow this queue
should not be considered for scale-down, as it creates a situation that
requires manual intervention to fix.
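To illustrate the kind of guard I mean (not a patch; I don't know the
ScaleDownHandler internals well enough to say where it belongs, and the method
names are my best guess from the log and the management API): temporary queues,
and anything bound to the management notification address, could simply be skipped
when collecting queues to scale down.
{code:java}
import org.apache.activemq.artemis.api.core.SimpleString;
import org.apache.activemq.artemis.core.server.Queue;

final class ScaleDownQueueFilter {

   // "activemq.notifications" by default, but the address is configurable, so a
   // real fix should read it from the broker configuration.
   private final SimpleString managementNotificationAddress;

   ScaleDownQueueFilter(SimpleString managementNotificationAddress) {
      this.managementNotificationAddress = managementNotificationAddress;
   }

   /**
    * True for queues that scale-down should ignore: temporary queues (the
    * ClusterConnectionBridge notification queues show up with temp=true in the
    * logs above) and any queue bound to the notification address.
    */
   boolean shouldSkip(Queue queue) {
      return queue.isTemporary()
            || managementNotificationAddress.equals(queue.getAddress());
   }
}
{code}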
--
This message was sent by Atlassian Jira
(v8.3.4#803005)