Sebastian Woelk created ARTEMIS-1106:
----------------------------------------
Summary: Failback in Replication-HA-Mode does not work properly
Key: ARTEMIS-1106
URL: https://issues.apache.org/jira/browse/ARTEMIS-1106
Project: ActiveMQ Artemis
Issue Type: Bug
Components: Broker
Reporter: Sebastian Woelk
I have a simple setup of two Artemis nodes configured to run in
replication HA mode, and I want to use the failback option. According to the
documentation [High Availability and
Failover|https://activemq.apache.org/artemis/docs/2.0.0/ha.html] I set the
{{check-for-live-server}} property to {{true}} on the master node and the
{{allow-failback}} property to {{true}} on the slave node, and both are
configured as nodes of the same cluster.
The {{cluster-user}} and the {{cluster-password}} match on both nodes.
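A minimal sketch of the setup described above, showing only the relevant
fragments of the {{core}} section of each {{broker.xml}}. The credentials are
placeholders rather than the actual values, and acceptors, connectors and
cluster-connections are omitted.
Master node:
{code:xml}
<!-- placeholder credentials; must match on both nodes -->
<cluster-user>CLUSTER_USER</cluster-user>
<cluster-password>CLUSTER_PASSWORD</cluster-password>
<ha-policy>
   <replication>
      <master>
         <!-- on restart, look for a live server with the same node id -->
         <check-for-live-server>true</check-for-live-server>
      </master>
   </replication>
</ha-policy>
{code}
Slave node:
{code:xml}
<!-- placeholder credentials; must match on both nodes -->
<cluster-user>CLUSTER_USER</cluster-user>
<cluster-password>CLUSTER_PASSWORD</cluster-password>
<ha-policy>
   <replication>
      <slave>
         <!-- hand control back to the master when it returns -->
         <allow-failback>true</allow-failback>
      </slave>
   </replication>
</ha-policy>
{code}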
When I start the master and the slave node for the first time, both nodes
work fine: the master becomes the live server and the slave runs as backup.
When I stop the master (which is the live server), the slave node detects that
the live server is down, takes over, and becomes the new live server.
But when I now restart the master node, both nodes (master and slave) begin to
log the message {{AMQ212034 There are more than one servers on the network
broadcasting the same node id. ... }} continuously, every few milliseconds.
Additionally, on the slave node the message
{code}
AMQ222216: Security problem while creating session: AMQ119031: Unable to
validate user
{code}
is logged twice before the {{AMQ212034}} messages.
If I now stop the slave server and restart it, it becomes the backup again and
the master node is the live server, but restarting the slave node should not be
necessary in this situation.
To get more detailed information about the security problem logged on the slave
node, I ran the servers with logging at DEBUG level and found the following
error message in the server log:
{code}
[org.apache.activemq.artemis.spi.core.security.ActiveMQJAASSecurityManager]
Couldn't validate user: javax.security.auth.login.FailedLoginException: User is
null
{code}
Furthermore, I added a trace log statement to the method
{{org.apache.activemq.artemis.core.security.impl.SecurityStoreImpl#authenticate()}}
and recompiled the server to see the user name that is passed in, and it is
in fact {{null}}.
So I think there is a bug in version 2.0.0 of Artemis. The newly started
master server connects to the already running slave server (which is in
live mode due to the previous failover) without any username or password. If
I understand the documentation correctly, it should use the cluster-user and
cluster-password from the {{broker.xml}}, but it does not.