On 09/21/2016 01:51 AM, Stefan Bauer wrote:
> Hi Ken,
>
> let me sum it up:
>
> Pacemaker in recent versions is smart enough to run (trigger, execute) the
> fence operation on a node that is not the target.
>
> If I have an external stonith device that can fence multiple nodes, a single
> primitive is enough in pacemaker.
>
> If with external/ipmi I can only address a single node, I need to have
> multiple primitives - one for each node.
>
> In this case it's recommended to let the primitive always run on the opposite
> node - right?

Yes, exactly :-)

In terms of implementation, I'd use a +INFINITY location constraint to tie
the device to the opposite node. This approach (as opposed to a -INFINITY
constraint on the target node) allows the target node to run the fence
device when the opposite node is unavailable.

> thank you.
>
> Stefan
>
> -----Original Message-----
>> From: Ken Gaillot <kgail...@redhat.com>
>> Sent: Tue 20 September 2016 16:49
>> To: users@clusterlabs.org
>> Subject: Re: [ClusterLabs] best practice fencing with ipmi in 2node-setups /
>> cloneresource/monitor/timeout
>>
>> On 09/20/2016 06:42 AM, Digimer wrote:
>>> On 20/09/16 06:59 AM, Stefan Bauer wrote:
>>>> Hi,
>>>>
>>>> I run a 2 node cluster and want to be safe in split-brain scenarios. For
>>>> this I set up external/ipmi to stonith the other node.
>>>
>>> Please use 'fence_ipmilan'. I believe that the older external/ipmi are
>>> deprecated (someone correct me if I am wrong on this).
>>
>> It's just an alternative. The "external/" agents come with the
>> cluster-glue package, which isn't provided by some distributions (such
>> as RHEL and its derivatives), so it's "deprecated" on those only.
>>
>>>> Some possible issues jumped to my mind and I would like to find the best
>>>> practice solution:
>>>>
>>>> - I have a primitive for each node to stonith. Many documents and guides
>>>> recommend to never let them run on the host it should fence. I would
>>>> set up clone resources to avoid dealing with locations that would also
>>>> influence scoring. Does that make sense?
>>>
>>> Since v1.1.10 of pacemaker, you don't have to worry about this.
>>> Pacemaker is smart enough to know where to run a fence call from in
>>> order to terminate a target.
>>
>> Right, fence devices can run anywhere now, and in fact they don't even
>> have to be "running" for pacemaker to use them -- as long as they are
>> configured and not intentionally disabled, pacemaker will use them.
>>
>> There is still a slight advantage to not running a fence device on a
>> node it can fence. "Running" a fence device in pacemaker really means
>> running the recurring monitor for it. Since the node that runs the
>> monitor has "verified" access to the device, pacemaker will prefer to
>> use it to execute that device. However, pacemaker will not use a node to
>> fence itself, except as a last resort if no other node is available. So,
>> running a fence device on a node it can fence means that the preference
>> is lost.
>>
>> That's a very minor detail, not worth worrying about. It's more a matter
>> of personal preference.
>>
>> In this particular case, a more relevant concern is that you need
>> different configurations for the different targets (the IPMI address is
>> different).
>>
>> One approach is to define two different fence devices, each with one
>> IPMI address. In that case, it makes sense to use the location
>> constraints to ensure the device prefers the node that's not its target.
>>
>> Another approach (if the fence agent supports it) is to use
>> pcmk_host_map to provide a different "port" (IPMI address) depending on
>> which host is being fenced. In this case, you need only one fence device
>> to be able to fence both hosts. You don't need a clone. (Remember, the
>> node "running" the device merely refers to its monitor, so the cluster
>> can still use the fence device, even if that node crashes.)
>>
>>>> - Monitoring operation on the stonith primitive is dangerous. I read
>>>> that if monitor operations fail for the stonith device, stonith action
>>>> is triggered. I think it's not clever to give the cluster the option to
>>>> fence a node just because it has an issue to monitor a fence device.
>>>> That should not be a reason to shut down a node. What is your opinion on
>>>> this? Can I just set the primitive monitor operation to disabled?
>>>
>>> Monitoring is how you will detect that, for example, the IPMI cable
>>> failed or was unplugged. I do not believe the node will get fenced on
>>> fence agent monitor failing... At least not by default.
>>
>> I am not aware of any situation in which a failing fence monitor
>> triggers a fence. Monitoring is good -- it verifies that the fence
>> device is still working.
>>
>> One concern particular to on-board IPMI devices is that they typically
>> share the same power supply as their host. So if the machine loses
>> power, the cluster can't contact the IPMI to fence it -- which means it
>> will be unable to recover any resources from the lost node. (It can't
>> assume the node lost power -- it's possible just network connectivity
>> between the two nodes was lost.)
>>
>> The only way around that is to have a second fence device (such as an
>> intelligent power switch). If the cluster can't reach the IPMI, it will
>> try the second device.
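
For reference, the two-device approach with +INFINITY location constraints
discussed above might look roughly like this in crm shell syntax. This is
only a sketch under assumptions: the node names (node1, node2), resource and
constraint IDs, IPMI addresses, credentials, interface type and monitor
interval are all placeholders that do not come from this thread, and the
parameter names should be checked against the external/ipmi agent shipped
with your cluster-glue version.

```
# Placeholder names, addresses and credentials; adjust to your cluster.
# One fence device per target node, each pointing at that node's IPMI address.
primitive fence-node1 stonith:external/ipmi \
    params hostname=node1 ipaddr=192.0.2.11 userid=admin passwd=secret interface=lan \
    op monitor interval=60s
primitive fence-node2 stonith:external/ipmi \
    params hostname=node2 ipaddr=192.0.2.12 userid=admin passwd=secret interface=lan \
    op monitor interval=60s

# +INFINITY preference for the opposite node. Unlike a -INFINITY constraint
# on the target, this still lets the target node run the device (i.e. its
# monitor) when the opposite node is unavailable.
location l-fence-node1 fence-node1 inf: node2
location l-fence-node2 fence-node2 inf: node1
```

The monitor operation is left enabled here, in line with the advice above
that monitoring is what detects a failed or unplugged IPMI connection.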