Excluding the tcp-connectors and leaving only the invm-connectors in the 
ha-policy, I'm seeing the following behavior after server0 has been shut down 
and restarted:

Server0 logs in an infinite loop:

...
2018-03-14 11:04:56,976 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-14 11:04:56,981 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221060: Sending quorum vote request to localhost/127.0.0.1:61618: 
RequestBackupVote [backupsSize=-1, nodeID=null, backupAvailable=false]
2018-03-14 11:04:56,983 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221061: Received quorum vote response from localhost/127.0.0.1:61618: 
RequestBackupVote [backupsSize=1, nodeID=82925fbd-275e-11e8-bff4-0a0027000011, 
backupAvailable=false]
...

Server1 logs in an infinite loop:

...
2018-03-14 11:04:51,982 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221062: Received quorum vote request: RequestBackupVote [backupsSize=-1, 
nodeID=null, backupAvailable=false]
2018-03-14 11:04:51,983 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221063: Sending quorum vote response: RequestBackupVote [backupsSize=1, 
nodeID=82925fbd-275e-11e8-bff4-0a0027000011, backupAvailable=false]
...

Why does this backup voting repeat endlessly without success, with a 
backupsSize of -1 and a null nodeID?
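
For reference, the ha-policy now looks roughly like this; this is a simplified 
sketch rather than the exact contents of my configuration (the full files are 
in [1] below), and the connector name is taken from the log above:

<ha-policy>
  <replication>
    <colocated>
      <backup-port-offset>100</backup-port-offset>
      <excludes>
        <!-- tcp connector left out of the colocated backup -->
        <connector-ref>netty-connector</connector-ref>
      </excludes>
      <master/>
      <slave>
        <scale-down/>
      </slave>
    </colocated>
  </replication>
</ha-policy>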

Best regards,
- Ilkka

-----Original Message-----
From: Ilkka Virolainen [mailto:ilkka.virolai...@bitwise.fi] 
Sent: 13 March 2018 16:46
To: users@activemq.apache.org
Subject: RE: Artemis 2.5.0 - Problems with colocated scaledown

Part of my problem was on the client side, but the scaledown issue is still 
unresolved. The client connectivity issues seem to be related to the scaledown 
issues. To reproduce the client connectivity problem: start both brokers, 
connect with a 1.5.4 client using tcp://localhost:61616 and send a message to 
a topic. Now shut down server0; it scales down to server1. Sending a message 
from the client now fails even though a failover should have occurred. 
Restarting server0 results in the infinite backup quorum vote.

Could I get clarification on whether the fault is in the broker configurations 
(ref. [1]) or whether this is an issue in Artemis? I'm aiming for a symmetric, 
statically defined cluster of two nodes, each storing a backup of the other's 
data. When one node is shut down, its data should be made available to the 
remaining live broker and clients should fail over to it. When the other 
broker is brought back online, replication should continue normally.
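
For what it's worth, the cluster itself is defined statically, roughly along 
these lines (simplified, with placeholder connector names; the actual files 
are in [1]):

<cluster-connections>
  <cluster-connection name="my-cluster">
    <connector-ref>netty-connector</connector-ref>
    <message-load-balancing>ON_DEMAND</message-load-balancing>
    <max-hops>1</max-hops>
    <static-connectors>
      <!-- connector pointing at the other broker of the pair -->
      <connector-ref>other-broker-connector</connector-ref>
    </static-connectors>
  </cluster-connection>
</cluster-connections>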

The documentation and examples give the impression that in-vm 
connectors/acceptors are needed for scaledown and synchronization between the 
slave storing the backup and the colocated live master that the backup is 
scaled down to. In any case, so far I've been unable to resolve these issues 
by trying out different HA options.
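
As far as the in-vm connectors go, if I've understood the examples correctly 
that would mean defining something like the following in each broker (again 
simplified, names are placeholders):

<connectors>
  <!-- in-vm connector used for scaledown to the colocated live server -->
  <connector name="invm-connector">vm://0</connector>
</connectors>

<acceptors>
  <acceptor name="invm-acceptor">vm://0</acceptor>
</acceptors>

and in the slave part of the colocated ha-policy:

<slave>
  <scale-down>
    <connectors>
      <connector-ref>invm-connector</connector-ref>
    </connectors>
  </scale-down>
</slave>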

Best regards,
- Ilkka

[1] Reference broker configuration 
https://github.com/ilkkavi/activemq-artemis/tree/scaledown-issue/issues/IssueExample/src/main/resources/activemq

-----Original Message-----
From: Ilkka Virolainen [mailto:ilkka.virolai...@bitwise.fi] 
Sent: 9 March 2018 14:21
To: users@activemq.apache.org
Subject: Artemis 2.5.0 - Problems with colocated scaledown

Hello,

I'm having some issues with scaledown of colocated servers. I have a 
symmetric, statically defined cluster of two colocated nodes configured with 
scale-down. The situation occurs as follows:

1. Start both brokers. They form a connection and replicate.

2. Shut down server1
-> The server shuts down, server0 detects the shutdown and scales down from 
the replicated backup.

3. Start server1
->
Server0 logs:
2018-03-09 10:57:57,434 WARN  [org.apache.activemq.artemis.core.server] 
AMQ222138: Local Member is not set at on ClusterConnection 
ClusterConnectionImpl@914942811[nodeUUID=1ed6bd4b-2377-11e8-a9e2-0a0027000011, 
connector=TransportConfiguration(name=netty-connector, 
factory=org-apache-activemq-artemis-core-remoting-impl-netty-NettyConnectorFactory)
 ?port=61616&host=localhost&activemq-passwordcodec=****, address=, 
server=ActiveMQServerImpl::serverUUID=1ed6bd4b-2377-11e8-a9e2-0a0027000011]

Server1 logs in an infinite loop:

2018-03-09 11:00:57,162 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:02,156 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:07,154 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:12,153 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:17,152 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:22,153 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:27,152 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
2018-03-09 11:01:32,149 INFO  [org.apache.activemq.artemis.core.server] 
AMQ221066: Initiating quorum vote: RequestBackupQuorumVote
...

The situation only normalizes when server1 is shut down and restarted.

Broker configurations for replicating: 
https://github.com/ilkkavi/activemq-artemis/tree/scaledown-issue/issues/IssueExample/src/main/resources/activemq
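
For context, the colocated part of the ha-policy is roughly of the following 
shape (simplified; the linked configuration has the actual values):

<ha-policy>
  <replication>
    <colocated>
      <!-- the live server keeps requesting a backup from the cluster -->
      <request-backup>true</request-backup>
      <max-backups>1</max-backups>
      <backup-request-retries>-1</backup-request-retries>
      <backup-request-retry-interval>5000</backup-request-retry-interval>
      <master/>
      <slave>
        <scale-down/>
      </slave>
    </colocated>
  </replication>
</ha-policy>

If I've understood these settings correctly, the five-second cadence of the 
quorum vote in the log above would match the backup request retry interval; 
what I don't understand is why the request never succeeds.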

I also have a separate issue that I've so far been unable to reproduce 
locally. When the brokers are deployed on two different physical servers and 
one node shuts down, the other stops accepting connections. Clients attempting 
to connect log:
org.apache.activemq.artemis.api.core.ActiveMQConnectionTimedOutException: 
AMQ119013: Timed out waiting to receive cluster topology. Group:null

I don't really understand why this happens or why it doesn't happen locally; 
the cluster topology should already be known to everyone involved. I realize 
it's difficult to comment on this since there's no way to reproduce it, but 
perhaps it's a situation someone has come across before?

Best regards,
- Ilkka
