> Tundra,
>
> Have you tried using a quorum server instead? It sounds like there is
> some problem either with the multi-pathing or with the device's ability
> to handle the reservation commands.

I have not yet. Am I misreading the documentation when I take it to say that a quorum server is a single point of failure for the cluster (i.e. the cluster's availability won't be any higher than the quorum server's)?

As it stands, I'm a little surprised, because my '4 node HA cluster' seems to be less tolerant of failure than a pair of 2-node clusters would be: with one of the four nodes out of the cluster, a reboot of one of the remaining 3 causes the other 2 to panic and reboot. I think I understand that this is intentional to avoid a partition, but it really feels like a '4 node cluster' is no more available than a '3 node cluster'. If I'm going to add a 5th machine for no purpose other than to be the quorum server, would I be better off making the 5th machine a 5th node?

My current experiment, which I'm working on setting up, is to run the private interconnect over physical NICs that are not shared for any other purpose. If that doesn't work, I'll try a quorum server. Either way, I'll keep this thread updated as I go.

--
This message posted from opensolaris.org
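
For reference, the 4-node versus 3-node observation above can be reproduced with a little vote arithmetic. This is only a rough sketch: it assumes a strict-majority rule, one vote per node, and no quorum device contributing any votes (which may be the effective situation if the device is not handling reservations). The function names are made up for illustration, and the real vote counts in a given cluster configuration may differ.

```python
def votes_needed(total_votes: int) -> int:
    """Smallest strict majority of the configured votes (assumed rule)."""
    return total_votes // 2 + 1


def tolerated_failures(total_votes: int) -> int:
    """How many votes can be lost before the majority is gone."""
    return total_votes - votes_needed(total_votes)


def has_quorum(total_votes: int, votes_present: int) -> bool:
    """True if the surviving votes still form a strict majority."""
    return votes_present >= votes_needed(total_votes)


if __name__ == "__main__":
    # Assumption: one vote per node, no quorum device votes.
    for nodes in (2, 3, 4, 5):
        print(nodes, "nodes: need", votes_needed(nodes),
              "votes, tolerate", tolerated_failures(nodes), "failure(s)")

    # 4 configured nodes, one already out, one more rebooting: 2 votes left.
    print("4 nodes, 2 present, quorum held:", has_quorum(4, 2))
```

Under these assumptions a 4-node cluster needs 3 votes, so it tolerates one lost node, the same as a 3-node cluster; a fifth vote raises that to two.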