[jira] [Work logged] (ARTEMIS-3831) Scale-down fails when using same discovery-group used by Broker cluster connection

ASF GitHub Bot (Jira) Wed, 10 Jan 2024 20:17:03 -0800


     [ 
https://issues.apache.org/jira/browse/ARTEMIS-3831?focusedWorklogId=899094&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-899094
 ]


ASF GitHub Bot logged work on ARTEMIS-3831:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Jan/24 04:16
            Start Date: 11/Jan/24 04:16
    Worklog Time Spent: 10m 
      Work Description: jbertram opened a new pull request, #4739:
URL: https://github.com/apache/activemq-artemis/pull/4739

   If both scale-down and cluster-connection are using the same JGroups 
discovery-group then when the cluster-connection stops it  will close the 
underlying org.jgroups.JChannel and when the scale-down process tries to use it 
to find a server it will fails.
       
   This commit ensures that the JGroupsBroadcastEndpoint implementation of 
BroadcastEndpoint#openClient initializes the channel if it has been closed.




Issue Time Tracking
-------------------

            Worklog Id:     (was: 899094)
    Remaining Estimate: 0h
            Time Spent: 10m

> Scale-down fails when using same discovery-group used by Broker cluster 
> connection
> ----------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-3831
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-3831
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.19.1, 2.31.0
>            Reporter: Apache Dev
>            Priority: Critical
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Using 2 Live brokers in cluster.
> Both having the following HA Policy:
> {code}
>         <ha-policy>
>             <live-only>
>                 <scale-down>
>                     <enabled>true</enabled>
>                     <discovery-group-ref 
> discovery-group-name="activemq-discovery-group"/>
>                 </scale-down>
>             </live-only>
>         </ha-policy>
> {code}
> where "activemq-discovery-group" is using JGroups TCPPING:
> {code}
>         <discovery-groups>
>             <discovery-group name="activemq-discovery-group">
>                 <jgroups-file>...</jgroups-file>
>                 <jgroups-channel>...</jgroups-channel>
>                 <refresh-timeout>10000</refresh-timeout>
>             </discovery-group>
>         </discovery-groups>
> {code}
> and it is used by the cluster of 2 brokers:
> {code}
>         <cluster-connections>
>             <cluster-connection name="activemq-cluster">
>                 <connector-ref>netty-connector</connector-ref>
>                 <retry-interval>5000</retry-interval>
>                 <use-duplicate-detection>true</use-duplicate-detection>
>                 <message-load-balancing>OFF</message-load-balancing>
>                 <max-hops>1</max-hops>
>                 <discovery-group-ref 
> discovery-group-name="activemq-discovery-group"/>
>             </cluster-connection>
>         </cluster-connections>
> {code}
> Issue is that when shutdown happens, scale-down fails:
> {code}
> org.apache.activemq.artemis.core.server                      W AMQ222181: 
> Unable to scaleDown messages
>         ActiveMQInternalErrorException[errorType=INTERNAL_ERROR 
> message=AMQ219004: Failed to initialise session factory]
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:272)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:655)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:554)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.connect(ServerLocatorImpl.java:533)
>         at 
> org.apache.activemq.artemis.core.server.LiveNodeLocator.connectToCluster(LiveNodeLocator.java:85)
>         at 
> org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.connectToScaleDownTarget(LiveOnlyActivation.java:146)
>         at 
> org.apache.activemq.artemis.core.server.impl.LiveOnlyActivation.freezeConnections(LiveOnlyActivation.java:114)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.freezeConnections(ActiveMQServerImpl.java:1468)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1250)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1166)
>         at 
> org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:1150)
>         ...
>         Caused by: ActiveMQInternalErrorException[errorType=INTERNAL_ERROR 
> message=channel is closed]
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:286)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.initialize(ServerLocatorImpl.java:268)
>         ... 44 more
>         Caused by: java.lang.IllegalStateException: channel is closed
>         at org.jgroups.JChannel.checkClosed(JChannel.java:957)
>         at org.jgroups.JChannel._preConnect(JChannel.java:548)
>         at org.jgroups.JChannel.connect(JChannel.java:288)
>         at org.jgroups.JChannel.connect(JChannel.java:279)
>         at 
> org.apache.activemq.artemis.api.core.jgroups.JChannelWrapper.connect(JChannelWrapper.java:126)
>         at 
> org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.internalOpen(JGroupsBroadcastEndpoint.java:113)
>         at 
> org.apache.activemq.artemis.api.core.JGroupsBroadcastEndpoint.openClient(JGroupsBroadcastEndpoint.java:91)
>         at 
> org.apache.activemq.artemis.core.cluster.DiscoveryGroup.start(DiscoveryGroup.java:111)
>         at 
> org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.startDiscovery(ServerLocatorImpl.java:284)
>         ... 45 more
> {code}
> JGroups channel used by scale-down is probably the same used by broker, but 
> already being closed during broker shutdown itself.
> As a workaround, it is possible to create a separate discovery-group (with 
> its own broadcast-group) so that scale-down uses a new JGroups channel not 
> being closed by broker.
> However, this causes duplication of configurations and a new JGroups port for 
> the scale-down discovery must be opened.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Work logged] (ARTEMIS-3831) Scale-down fails when using same discovery-group used by Broker cluster connection

Reply via email to