On Wednesday, June 15, 2011 16:26:56 mark - pacemaker list wrote:
> On Wed, Jun 15, 2011 at 12:24 PM, imnotpc <imno...@rock3d.net> wrote:
> > What I was thinking is that the DC is never fenced
>
> Is this actually the case? It would sure explain the one "gotcha" I've
> never been able to work around in a three-node cluster with stonith/SBD.
> If you unplug the network cable from the DC (but it and the other nodes
> all still see the SBD disk via their other NIC(s)), the DC of course
> becomes completely isolated. It will fence one of the still-good nodes
> right away, and the surviving node that still has network connectivity
> will become DC. So you have two DCs: the original one, which is
> disconnected from the network, and your newly "elected" one (not really
> elected, it just took over because it's the last host left with network).
> When the just-fenced node comes back up, you get quorum with the new DC,
> and your disconnected DC finally gets shot.
>
> For any non-DC node you get exactly the behavior you'd expect:
> unplugging its network cable gets it fenced and everyone else stays happy.
> I'd hoped that unplugging the DC would have the other two say, "well,
> our DC is gone, but we can see each other, so it needs to be fenced".
> Maybe I've just missed a necessary timeout setting somewhere to delay
> the isolated DC from fencing a good node so quickly?
>
> Sorry, I guess that's a thread hijack, but I've looked and googled and
> never been able to find anything that says DCs don't get fenced, so this
> has confused me for a while.
>
> Regards,
> Mark
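On the timeout question: I don't know of a setting that stops an isolated
DC from trying to fence at all, but if your Pacemaker version supports the
generic stonith-device option pcmk_delay_max, a random delay before the
device fires is the usual way to keep the two sides of a split from
shooting at the same instant. I'm not certain it covers your exact
scenario, but roughly (device path and values are just placeholders):

    primitive fencing-sbd stonith:external/sbd \
            params sbd_device="/dev/disk/by-id/YOUR-SBD-DISK" \
                   pcmk_delay_max="30s" \
            op monitor interval="3600"

It won't keep a DC that genuinely cannot see its peers from eventually
fencing someone, but it at least breaks the symmetry of the race.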
As for the DC being fenced: I wasn't making a pronouncement that DCs are
always up and unique. Dejan indicates they can fail, and you've shown that
there can be more than one. My point was that, as designed and conceptually,
there should be only one and it should always be running.

I think your example actually makes my point in a way. If every cluster had
a unique notifying agent that was always running (or immediately restarted),
and you suddenly got messages from multiple agents, you would immediately
know what had happened, as opposed to wading through a flood of mail from
each node or getting nothing at all. I chose the DC as an example of an
agent that would meet these needs, as opposed to resources, which work
poorly in this role.

Jeff
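P.S. For concreteness, a resource-based notifier today typically looks
something like a single ocf:pacemaker:ClusterMon instance feeding crm_mon
events to a script (assuming a crm_mon built with the external-agent
option; the script path is just a placeholder):

    primitive cluster-notify ocf:pacemaker:ClusterMon \
            params user="root" \
                   extra_options="-E /usr/local/bin/cluster_notify.sh" \
            op monitor interval="60"

The trouble is that it's still just a resource: if the cluster is too
broken to run it, or it ends up in the wrong partition, you get silence or
a flood instead of one authoritative message, which is exactly why an agent
tied to the DC (or something equally unique and always running) looks more
attractive to me.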