Hi,

On Thu, Feb 07, 2008 at 05:00:14PM +0100, Sebastian Reitenbach wrote:
> Hi,
> 
> I have a 4 node cluster, and wanted to set up a quorum server, so that I do 
> not need three running cluster nodes to get quorum. The quorumd IP address 
> is a shared IP on another two node cluster. 
> 
> I ran the following tests, with the quorumd from heartbeat 2.1.2 and the 
> cluster nodes running heartbeat 2.1.3:
> 
> 
> 
> start quorumd 
> start first cluster node -> (node becomes DC, contacting the quorum) cluster 
> gets quorum
> start second cluster node -> cluster still has quorum
> stop DC, -> see other node becoming DC, and contacting quorum server, 
> cluster still has quorum
> kill quorumd, then see RST packets going back to cluster node (the DC tries 
> to contact the quorumd every second) -> cluster still has quorum
> wait 5 minutes -> cluster still has quorum
> try to start or stop a node or a resource, or add or remove a resource -> 
> this works, then the cluster recognizes the lost quorum

After any of these actions the cluster loses quorum? Or is it
just after the node restart?

> then restart the quorumd -> see answers going back from quorumd to DC node, 
> but cluster still has no quorum
> wait 5 minutes -> cluster still has no quorum

I can recall that somebody else already complained about the same
issue.

> restart heartbeat on one of the cluster nodes -> cluster recognizes the 
> availability of quorumd and gets quorum again
> 
> Setting a node to standby, does not make the cluster recognize that the 
> quorum got lost, or is available again.
> 
> I have also seen that when a firewall drops the packets instead of 
> answering with RST while the quorumd is down, the rate at which the DC 
> tries to reconnect to the quorumd falls to about once a minute. That is 
> OK, as I'd guess it's waiting for timeouts.

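Yes, that looks like normal TCP/IP behaviour: a connection refused with
RST fails immediately, whereas dropped packets are only given up on after
the connect timeout.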
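
To illustrate the difference between the two cases, here is a minimal
sketch in Python. It is not heartbeat or quorumd code; the function name
probe() and the timeout value are made up for the example. It makes one
TCP connect attempt and reports how the attempt ended and how long it took.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Make one TCP connect attempt; return (outcome, seconds elapsed).

    Hypothetical helper for illustration only, not part of heartbeat.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except ConnectionRefusedError:
        # The peer answered with RST, so the failure is known at once.
        outcome = "refused"
    except socket.timeout:
        # Nothing came back (e.g. a firewall dropped the packets), so the
        # failure is only known once the timeout has expired.
        outcome = "timeout"
    except OSError:
        # Any other network error (unreachable host, etc.).
        outcome = "error"
    return outcome, time.monotonic() - start


if __name__ == "__main__":
    # Port 1 on localhost is assumed to have no listener here.
    print(probe("127.0.0.1", 1))
```

A client that retries right after each failure will therefore loop quickly
in the "refused" case and much more slowly in the "timeout" case, which
would match the once-a-second versus once-a-minute rates described above.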

> So in my eyes, using a quorumd does more harm than good, but maybe I 
> did something wrong?

Since it has been working, you probably set it up ok. You should
open a bugzilla for this. Sorry that I can't offer more help on
the matter now.

BTW, did you also test a split brain situation where one of the
nodes can talk to the quorumd?

Thanks,

Dejan

> 
> cheers
> Sebastian
> 
> _______________________________________________
> Linux-HA mailing list
> [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
