Keith Wall created QPID-6209:
--------------------------------

             Summary: Ensure node discoverer returns a value for every node 
even if node is partitioned
                 Key: QPID-6209
                 URL: https://issues.apache.org/jira/browse/QPID-6209
             Project: Qpid
          Issue Type: Bug
          Components: Java Broker
    Affects Versions: 0.30
            Reporter: Keith Wall
             Fix For: 0.31


The BDB HA node discoverer is used by the BDBHAVHN to determine the state of 
the other nodes in the group.  The information is presented on the UI so that 
the Operator has a complete picture of the group.

An issue was found during the HA acceptance test that deliberately forms 
a network partition around a master node.  The state of the group, when 
observed from the other nodes, was incorrect:  the group appeared to have 
two masters - both the newly elected master and the original master. 

The underlying issue was the monitoring itself.  JE DbPing can hang whilst 
trying to connect to the target node:  if the packets are lost (as is the 
case during our test, which uses iptables DROP), it hangs indefinitely. 
This behaviour meant that the node discoverer 
(ReplicatedEnvironmentFacade.RemoteNodeStateLearner#discoverNodeStates) 
returned no map entry for the partitioned node, so the stale value lingered 
on the UI.
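
One possible way to guarantee an entry for every node is sketched below.  It 
is not the actual fix and does not use the JE or Qpid APIs:  the class name, 
the UNREACHABLE placeholder state and the ping function are all hypothetical. 
Each ping runs on its own thread, the caller waits up to a shared deadline, 
and any node that has not answered by then is still recorded in the map.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical sketch; not the Qpid ReplicatedEnvironmentFacade code.
final class NodeStateDiscoverySketch {

    // Assumed placeholder state for nodes that fail to answer in time.
    static final String UNREACHABLE = "UNREACHABLE";

    // Returns one entry per node, even if the ping for a node hangs or fails.
    // 'ping' stands in for whatever call learns a remote node's state.
    static Map<String, String> discoverNodeStates(List<String> nodes,
                                                  Function<String, String> ping,
                                                  long timeoutMillis)
            throws InterruptedException {
        // Daemon threads so a ping that ignores interruption cannot block JVM exit.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            // Start all pings in parallel so one slow node does not delay the others.
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (String node : nodes) {
                Callable<String> task = () -> ping.apply(node);
                futures.put(node, executor.submit(task));
            }

            Map<String, String> states = new LinkedHashMap<>();
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            for (Map.Entry<String, Future<String>> entry : futures.entrySet()) {
                long remaining = Math.max(0L, deadline - System.nanoTime());
                try {
                    states.put(entry.getKey(),
                               entry.getValue().get(remaining, TimeUnit.NANOSECONDS));
                } catch (TimeoutException | ExecutionException e) {
                    // No answer (or an error): record a value rather than omit the node.
                    entry.getValue().cancel(true);
                    states.put(entry.getKey(), UNREACHABLE);
                }
            }
            return states;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // node3 simulates a partitioned node whose packets are dropped.
        Function<String, String> ping = node -> {
            if (node.equals("node3")) {
                try {
                    Thread.sleep(60_000);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            return "REPLICA";
        };
        System.out.println(
                discoverNodeStates(Arrays.asList("node1", "node2", "node3"), ping, 500));
    }
}
```

With this shape the caller always receives a map whose key set matches the 
list of nodes it asked about, so a consumer such as the UI can overwrite a 
stale state instead of keeping it.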



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
