[jira] [Created] (ARTEMIS-1106) Failback in Replication-HA-Mode does not work properly

Sebastian Woelk (JIRA) Mon, 10 Apr 2017 07:12:56 -0700

Sebastian Woelk created ARTEMIS-1106:
----------------------------------------


             Summary: Failback in Replication-HA-Mode does not work properly
                 Key: ARTEMIS-1106
                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1106
             Project: ActiveMQ Artemis
          Issue Type: Bug
          Components: Broker
            Reporter: Sebastian Woelk


I have a simple setup of two Artemis nodes configured to run in 
replication HA mode, and I want to use the failback option. Following the 
documentation [High Availability and 
Failover|https://activemq.apache.org/artemis/docs/2.0.0/ha.html], I set the 
{{check-for-live-server}} property to {{true}} on the master node and the 
{{allow-failback}} property to {{true}} on the slave node. Both are configured 
as nodes of the same cluster.
The {{cluster-user}} and the {{cluster-password}} match on both nodes.
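
A minimal sketch of the configuration described above, assuming the 
{{ha-policy}} element layout from the linked documentation. The credential 
values are placeholders, and this is not a copy of the actual {{broker.xml}} 
files from the affected setup:
{code:xml}
<!-- master node: broker.xml, inside <core> -->
<cluster-user>CLUSTER_USER_PLACEHOLDER</cluster-user>
<cluster-password>CLUSTER_PASSWORD_PLACEHOLDER</cluster-password>
<ha-policy>
   <replication>
      <master>
         <!-- on restart, look for a live server with the same node id -->
         <check-for-live-server>true</check-for-live-server>
      </master>
   </replication>
</ha-policy>

<!-- slave node: broker.xml, inside <core> -->
<!-- same placeholder credentials as on the master node -->
<cluster-user>CLUSTER_USER_PLACEHOLDER</cluster-user>
<cluster-password>CLUSTER_PASSWORD_PLACEHOLDER</cluster-password>
<ha-policy>
   <replication>
      <slave>
         <!-- hand the live role back once the master returns -->
         <allow-failback>true</allow-failback>
      </slave>
   </replication>
</ha-policy>
{code}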

When I start the master and the slave node for the first time, both nodes 
work fine: the master becomes the live server and the slave runs as backup. 
When I stop the master (which is the live server), the slave node detects that 
the live server is down, takes over, and becomes the new live server.
But when I now restart the master node, both nodes (master and slave) begin to 
log the message {{AMQ212034 There are more than one servers on the network 
broadcasting the same node id. ... }} continuously, every few milliseconds.

Additionally, on the slave node the message 
{code}
AMQ222216: Security problem while creating session: AMQ119031: Unable to validate user
{code}
is logged twice before the {{AMQ212034}} messages.
If I now stop the slave server and restart it, it becomes the backup again and 
the master node is the live server, but restarting the slave node should not be 
necessary in this situation.

To get more detailed information about the security problem logged on the slave 
node, I ran the servers with logging at DEBUG level and found the following 
error message in the server log:
{code}
[org.apache.activemq.artemis.spi.core.security.ActiveMQJAASSecurityManager] Couldn't validate user: javax.security.auth.login.FailedLoginException: User is null
{code}

Furthermore, I added a trace log statement to the method 
{{org.apache.activemq.artemis.core.security.impl.SecurityStoreImpl#authenticate()}} 
and recompiled the server to see the user name that is passed in, and it is 
in fact {{null}}.

So I think there is a bug in version 2.0.0 of Artemis. The newly started 
master server connects to the already running slave server (which is in 
live mode due to the previous failover) without any username or password. If 
I understand the documentation correctly, it should use the {{cluster-user}} and 
{{cluster-password}} from the {{broker.xml}}, but it does not.




--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
