jbertram commented on PR #4899: URL: https://github.com/apache/activemq-artemis/pull/4899#issuecomment-2299734904
I added your test to the branch with my fix. I can see my fix detecting a problem and closing the connection, but the test still fails, and I still see messages like this:

```
WARN [org.apache.activemq.artemis.core.server] AMQ222139: MessageFlowRecordImpl [nodeID=13207315-5f2f-11ef-b63b-5c80b6f32172, connector=TransportConfiguration(name=netty-connector, factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)?port=61616&host=localhost, queueName=$.artemis.internal.sf.my-cluster.13207315-5f2f-11ef-b63b-5c80b6f32172, queue=QueueImpl[name=$.artemis.internal.sf.my-cluster.13207315-5f2f-11ef-b63b-5c80b6f32172, postOffice=PostOfficeImpl [server=ActiveMQServerImpl::name=localhost], temp=false]@3ca984c7, isClosed=false, reset=true]::Remote queue binding exampleQueue2dc45dca-5f2f-11ef-b5ea-5c80b6f32172 has already been bound in the post office. Most likely cause for this is you have a loop in your cluster due to cluster max-hops being too large or you have multiple cluster connections to the same nodes using overlapping addresses
```

Is this the kind of message you see in your K8s cluster when this problem occurs, and is that what you were referring to in the Jira when you said this?

> This messes things up with the cluster, the old message flow record is invalid.

I reproduced this with a very simple manual test using 2 clustered nodes with persistence disabled. When I kill one node and restart it, I see the `AMQ222139` message on the _other_ node. However, I resolved this by simply changing the configuration on the `cluster-connection` using:

```
<reconnect-attempts>0</reconnect-attempts>
```

I then cherry-picked your `ZeroPersistenceSymmetricalClusterTest` test to the `main` branch. The test fails by default, but when I change the various `broker.xml` files used by that test to use `0` for `reconnect-attempts`, the test passes. Also, given that persistence is disabled, this is the configuration I would recommend.
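
For context, here is a rough sketch of where that setting would sit in `broker.xml`. It assumes the cluster connection is named `my-cluster` and refers to a connector named `netty-connector`, as the log message above suggests. Everything else in the fragment is illustrative, and your existing `cluster-connection` settings would stay as they are:

```xml
<core xmlns="urn:activemq:core">
   <!-- Persistence is disabled in the scenario discussed here -->
   <persistence-enabled>false</persistence-enabled>

   <cluster-connections>
      <!-- Name and connector-ref are assumed from the log message above -->
      <cluster-connection name="my-cluster">
         <connector-ref>netty-connector</connector-ref>
         <!-- 0 = do not retry the cluster bridge after its connection fails -->
         <reconnect-attempts>0</reconnect-attempts>
         <!-- remaining cluster-connection elements unchanged (omitted here) -->
      </cluster-connection>
   </cluster-connections>
</core>
```

The same change would need to be applied to the `broker.xml` of every node in the cluster.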
Have you considered this configuration change in your environment? It seems this would resolve your problem with no code changes necessary.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
