Keith Wall created QPID-6209:
--------------------------------
Summary: Ensure node discoverer returns a value for every node
even if node is partitioned
Key: QPID-6209
URL: https://issues.apache.org/jira/browse/QPID-6209
Project: Qpid
Issue Type: Bug
Components: Java Broker
Affects Versions: 0.30
Reporter: Keith Wall
Fix For: 0.31
The BDB HA node discoverer is used by the BDB HA virtual host node (BDBHAVHN)
to determine the state of the other nodes in the group. The information is
presented on the UI so that the Operator has a complete picture of the group.
An issue was found during the HA acceptance test that deliberately forms a
network partition around a master node. The state of the group, when observed
from the other nodes, was incorrect: the group appeared to have two masters,
both the newly elected master and the original master.
The underlying issue was in the monitoring itself. JE DbPing hangs
indefinitely whilst trying to connect to the target node if the packets are
lost (as is the case during our test, which uses iptables DROP).
This behaviour meant that the node discoverer
(ReplicatedEnvironmentFacade.RemoteNodeStateLearner#discoverNodeStates)
returned no map entry for the partitioned node, so the stale value lingered
on the UI.
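One possible shape for a fix is sketched below. It is an illustration only,
not the broker's actual code: the class NodeStateDiscoverySketch, its
discoverNodeStates signature, the UNKNOWN placeholder and the use of a
Callable per node in place of the real DbPing call are all assumptions made
for the example. The idea is to bound each probe with a timeout and to write
a map entry for every node, so that a node that cannot be reached is reported
as unknown rather than omitted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: none of these names come from the Qpid code base.
class NodeStateDiscoverySketch {
    // Placeholder state reported for a node whose probe failed or timed out.
    static final String UNKNOWN = "UNKNOWN";

    // probes maps a node name to a task that returns that node's state.
    // In the broker the task would wrap the (possibly hanging) ping call.
    static Map<String, String> discoverNodeStates(
            Map<String, Callable<String>> probes, long timeoutMillis)
            throws InterruptedException {
        // Daemon threads, so a probe that ignores interruption cannot
        // keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "node-state-probe");
            t.setDaemon(true);
            return t;
        });
        try {
            List<String> names = new ArrayList<>();
            List<Callable<String>> tasks = new ArrayList<>();
            for (Map.Entry<String, Callable<String>> e : probes.entrySet()) {
                names.add(e.getKey());
                tasks.add(e.getValue());
            }
            // invokeAll returns once all tasks finish or the timeout expires;
            // tasks still running at that point are cancelled.
            List<Future<String>> futures =
                    executor.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            Map<String, String> states = new LinkedHashMap<>();
            for (int i = 0; i < names.size(); i++) {
                String state;
                try {
                    state = futures.get(i).get();
                } catch (CancellationException | ExecutionException e) {
                    // Timed out or failed: still record an entry for the node.
                    state = UNKNOWN;
                }
                states.put(names.get(i), state);
            }
            return states;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With this arrangement the caller always receives one entry per node, so the
UI can replace a stale MASTER value with the unknown placeholder. Cancelling
a task only interrupts its thread; a probe blocked in a call that ignores
interruption would keep running in the background until it returns.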