[ https://issues.apache.org/jira/browse/ARTEMIS-3831?focusedWorklogId=899094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-899094 ]
ASF GitHub Bot logged work on ARTEMIS-3831:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 11/Jan/24 04:16
Start Date: 11/Jan/24 04:16
Worklog Time Spent: 10m
Work Description: jbertram opened a new pull request, #4739:
URL: https://github.com/apache/activemq-artemis/pull/4739
If both scale-down and cluster-connection use the same JGroups
discovery-group, then stopping the cluster-connection closes the underlying
org.jgroups.JChannel. When the scale-down process later tries to use that
channel to find a server, it fails.
This commit ensures that the JGroupsBroadcastEndpoint implementation of
BroadcastEndpoint#openClient initializes the channel if it has been closed.
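
For illustration only, the reopen-if-closed idea can be sketched in isolation
as below. This is not the code from the pull request and it does not use the
JGroups API: FakeChannel, SharedEndpoint and every method on them are
hypothetical stand-ins. The only behavior taken from the report is that
connecting a closed channel throws IllegalStateException("channel is closed").

```java
public class ReopenOnClosedSketch {

    /** Hypothetical stand-in for a channel that cannot be reused once closed. */
    static final class FakeChannel {
        private boolean closed;
        private boolean connected;

        void connect() {
            if (closed) {
                // Mirrors the failure seen in the reported stack trace.
                throw new IllegalStateException("channel is closed");
            }
            connected = true;
        }

        void close() {
            connected = false;
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }

        boolean isConnected() {
            return connected;
        }
    }

    /** Hypothetical endpoint shared by two users of one discovery-group. */
    static final class SharedEndpoint {
        private FakeChannel channel;
        private int channelsCreated;

        synchronized void openClient() {
            // Assumed fix pattern: create a fresh channel when there is none
            // yet or when a previous user has already closed it.
            if (channel == null || channel.isClosed()) {
                channel = new FakeChannel();
                channelsCreated++;
            }
            channel.connect();
        }

        synchronized void close() {
            if (channel != null) {
                channel.close();
            }
        }

        synchronized boolean isConnected() {
            return channel != null && channel.isConnected();
        }

        synchronized int channelsCreated() {
            return channelsCreated;
        }
    }

    public static void main(String[] args) {
        SharedEndpoint endpoint = new SharedEndpoint();
        endpoint.openClient(); // cluster-connection starts discovery
        endpoint.close();      // cluster-connection stops, channel is closed
        endpoint.openClient(); // scale-down starts discovery on the same endpoint
        System.out.println("connected: " + endpoint.isConnected());
        System.out.println("channels created: " + endpoint.channelsCreated());
    }
}
```

Without the closed check in openClient, the second call would reuse the closed
channel and throw, which is the failure mode described in the issue.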
Issue Time Tracking
-------------------
Worklog Id: (was: 899094)
Remaining Estimate: 0h
Time Spent: 10m
> Scale-down fails when using same discovery-group used by Broker cluster
> connection
> ----------------------------------------------------------------------------------
>
> Key: ARTEMIS-3831
> URL: https://issues.apache.org/jira/browse/ARTEMIS-3831
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Components: Broker
> Affects Versions: 2.19.1, 2.31.0
> Reporter: Apache Dev
> Priority: Critical
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Using 2 live brokers in a cluster, both with the following HA policy:
> {code}
> <ha-policy>
>    <live-only>
>       <scale-down>
>          <enabled>true</enabled>
>          <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
>       </scale-down>
>    </live-only>
> </ha-policy>
> {code}
> where "activemq-discovery-group" is using JGroups TCPPING:
> {code}
> <discovery-groups>
>    <discovery-group name="activemq-discovery-group">
>       <jgroups-file>...</jgroups-file>
>       <jgroups-channel>...</jgroups-channel>
>       <refresh-timeout>10000</refresh-timeout>
>    </discovery-group>
> </discovery-groups>
> {code}
> and it is used by the cluster of 2 brokers:
> {code}
> <cluster-connections>
>    <cluster-connection name="activemq-cluster">
>       <connector-ref>netty-connector</connector-ref>
>       <retry-interval>5000</retry-interval>
>       <use-duplicate-detection>true</use-duplicate-detection>
>       <message-load-balancing>OFF</message-load-balancing>
>       <max-hops>1</max-hops>
>       <discovery-group-ref discovery-group-name="activemq-discovery-group"/>
>    </cluster-connection>
> </cluster-connections>
> {code}
> The issue is that scale-down fails when the broker shuts down:
> {code}
> org.apache.activemq.artemis.core.server W AMQ222181: Unable to scaleDown messages
> ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=AMQ219004: Failed to initialise session factory]
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
>     at org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
>     at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
>     at org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
>     at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
>     ...
> Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR message=channel is closed]
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
>     ... 44 more
> Caused by: java.lang.IllegalStateException: channel is closed
>     at org.jgroups.JChannel.checkClosed(JChannel.java:957)
>     at org.jgroups.JChannel._preConnect(JChannel.java:548)
>     at org.jgroups.JChannel.connect(JChannel.java:288)
>     at org.jgroups.JChannel.connect(JChannel.java:279)
>     at org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
>     at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
>     at org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
>     at org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
>     at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
>     ... 45 more
> {code}
> The JGroups channel used by scale-down is probably the same one used by the
> broker, and it has already been closed during the broker shutdown itself.
> As a workaround, it is possible to create a separate discovery-group (with
> its own broadcast-group) so that scale-down uses a new JGroups channel that
> is not closed by the broker.
> However, this duplicates configuration and requires opening a new JGroups
> port for the scale-down discovery.