Re: [ClusterLabs] Fencing with a 3-node (1 for quorum only) cluster

Dan Swartzendruber Thu, 04 Aug 2016 17:21:12 -0700

On 2016-08-04 19:33, Digimer wrote:

On 04/08/16 07:21 PM, Dan Swartzendruber wrote:
On 2016-08-04 19:03, Digimer wrote:
On 04/08/16 06:56 PM, Dan Swartzendruber wrote:
I'm setting up an HA NFS server to serve up storage to a couple of
vsphere hosts. I have a virtual IP, and it depends on a ZFS resource
agent which imports or exports a pool. So far, with stonith disabled,
it all works perfectly.  I was dubious about a 2-node solution, so I
created a 3rd node which runs as a virtual machine on one of the hosts.
All it is for is quorum.  So, looking at fencing next.  The primary
server is a poweredge R905, which has DRAC for fencing.  The backup
storage node is a Supermicro X9-SCL-F (with IPMI). So I would be using
the DRAC agent for the former and the ipmilan for the latter?  I was
reading about location constraints, where you tell each instance of the
fencing agent not to run on the node that would be getting fenced. So,
my first thought was to configure the drac agent and tell it not to
fence node 1, and configure the ipmilan agent and tell it not to fence
node 2. The thing is, there is no agent available for the quorum node.
Would it make more sense instead to tell the drac agent to only run on
node 2, and the ipmilan agent to only run on node 1?  Thanks!
This is a common mistake.
Fencing and quorum solve different problems and are not interchangeable.
In short;

Fencing is a tool when things go wrong.

Quorum is a tool when things are working.
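A minimal sketch of the majority rule behind quorum, assuming one vote per node and a plain majority check. The function name is hypothetical, and the special two-node handling that corosync offers is deliberately left out:

```python
def has_quorum(votes_present: int, expected_votes: int) -> bool:
    """Return True when a partition holds a strict majority of the votes.

    Assumption: every node carries exactly one vote and no special
    two-node option is in effect.
    """
    return votes_present > expected_votes // 2


# Three-node cluster (two storage nodes plus a quorum-only node):
# a partition of two nodes keeps quorum, a lone node does not.
three_node_pair = has_quorum(2, 3)
three_node_single = has_quorum(1, 3)

# Plain two-node cluster: after a split, neither side has a majority.
two_node_single = has_quorum(1, 2)
```

Under this rule, a split in a plain two-node cluster leaves neither side quorate, which is why a quorum-only third node looks attractive in the first place.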
The only impact that having quorum has with regard to fencing is that it
avoids a scenario when both nodes try to fence each other and the faster
one wins (which is itself OK). Even then, you can add 'delay=15' to the
node you want to win and it will win in such a case. In the old days, it
would also prevent a fence loop if you started the cluster on boot and
comms were down. Now though, you set 'wait_for_all' and you won't get a
fence loop, so that solves that.
Said another way; Quorum is optional, fencing is not (people often get
that backwards).
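The effect of the delay in a mutual fence race can be sketched as a toy model. It assumes each node is shot only after the delay configured on the fence device that targets it, so the node with the longer delay survives. The function name and node names are hypothetical, and real timing also depends on the network and the BMC:

```python
from typing import Dict


def fence_race_winner(delays: Dict[str, float]) -> str:
    """Return the surviving node in a two-node fence race.

    `delays` maps each node name to the number of seconds that the
    fence action *against that node* is held back. The node that is
    shot first loses, so the survivor is the one with the larger delay.
    """
    if len(delays) != 2:
        raise ValueError("this toy model only covers two-node clusters")
    first, second = delays
    if delays[first] == delays[second]:
        # With equal delays the outcome depends on who is faster.
        raise ValueError("equal delays: the winner is not deterministic")
    return first if delays[first] > delays[second] else second


# Hypothetical setup: fencing of node1 is delayed by 15 seconds, so
# node1 gets to fence node2 before it can be shot itself.
winner = fence_race_winner({"node1": 15, "node2": 0})
```

With equal delays the model refuses to pick a winner, which mirrors the "faster one wins" case described above.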
As for DRAC vs IPMI, no, they are not two things. In fact, I am pretty
certain that fence_drac is a symlink to fence_ipmilan. All DRAC is (same
with iRMC, iLO, RSA, etc) is "IPMI + features". Fundamentally, the fence
action; rebooting the node, works via the basic IPMI standard using the
DRAC's BMC.

To do proper redundant fencing, which is a great idea, you want
something like switched PDUs. This is how we do it (with two node
clusters). IPMI first, and if that fails, a pair of PDUs (one for each
PSU, each PDU going to independent UPSes) as backup.
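The "IPMI first, PDUs as backup" ordering can be sketched as a list of fencing levels. This is a simplified model, assuming (as in Pacemaker's fencing topology) that levels are tried in order and that every device in a level must succeed for that level to count. The names below are hypothetical, and each device is just a callable returning True on success:

```python
from typing import Callable, Sequence

# A fence device is modelled as a function: node name -> success flag.
FenceDevice = Callable[[str], bool]


def fence_with_levels(node: str,
                      levels: Sequence[Sequence[FenceDevice]]) -> bool:
    """Try each fencing level in order and stop at the first that works.

    A level succeeds only if all of its devices succeed, e.g. both
    PDUs must cut power when the node has two power supplies.
    """
    for devices in levels:
        if all(device(node) for device in devices):
            return True
    # No level worked: the node's state is unknown and the cluster
    # would stay blocked.
    return False
```

For example, passing `[[ipmi], [pdu_a, pdu_b]]` as `levels` means the two PDUs are only used if the IPMI attempt fails, and both must report success.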
Thanks for the quick response.  I didn't mean to give the impression
that I didn't know the difference between quorum and fencing.  The only
reason I (currently) have the quorum node was to prevent a deathmatch
(which I had read about elsewhere.)  If it is as simple as adding a
delay as you describe, I'm inclined to go that route.  At least on
CentOS7, fence_ipmilan and fence_drac are not the same.  e.g. they are
both python scripts that are totally different.
The delay is perfectly fine. We've shipped dozens of systems over the
last five or so years; all were 2-node and none have had
trouble. Where node failures have occurred, fencing operated properly
and services were recovered. So in my opinion, in the interest of
minimizing complexity, I recommend the two-node approach.

As for the two agents not being symlinked, OK. It still doesn't change
the core point though that both fence_ipmilan and fence_drac would be
acting on the same target.

Note; If you lose power to the mainboard (which we've seen, failed
mainboard voltage regulator did this once), you lose the IPMI (DRAC)
BMC. This scenario will leave your cluster blocked without an external
secondary fence method, like switched PDUs.

cheers


Thanks!



_______________________________________________
Users mailing list: [email protected]
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
