Thanks for your answer, it resolves all of my doubts.

I tried this yesterday and the 2 nodes were powered off instantly (this caused some trouble on reboot, with filesystems unmounted uncleanly). Is there a way to do a clean shutdown instead of a power-off?

After some reflection, I've decided to add a third node (a simple workstation) as an arbiter, with only fencing primitives on it. Is that a good idea? Is this solution reliable?

Regards,
Bruno

On 25/07/2013 16:53, Digimer wrote:
With two-node clusters, quorum can't be used. This is fine *if* you have good fencing. If the nodes partition (ie: network failure), both will try to fence the other. In theory, the faster node will power off the other node before the slower node can kill the faster node. In practice, this isn't always the case.

IPMI (and iDRAC, etc) are independent devices. So it is possible for both nodes to initiate a power-down on the other before either dies. To avoid this, you will want to set a delay for the primary/active node's fence primitive.

Say "node1" is your active node and "node2" is your backup. You would set a delay of, say, 15 seconds against "node1". Now if there is a partition, node1 would look up how to fence node2 and immediately initiate power off. Node 2, however, would look up how to fence node1, see a 15 second delay, and start a timer before calling the power-off. Of course, node2 will die before the timer expires.

You can also disable acpid on the nodes. With acpid disabled, "pressing the power button" results in a near-instant off. If you do this, reducing your delay to 5 seconds would probably be plenty.
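
For example, on a sysvinit-based node (a RHEL/CentOS 6-era assumption; adjust for your init system):

service acpid stop      # stop acpid now
chkconfig acpid off     # keep it from starting on boot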

There is another issue to be aware of: "fence loops". The problem with two-node clusters and not using quorum is that a single node can fence the other. So let's continue our example above...

Node 2 will eventually reboot. If you have pacemaker set to start on boot, it will start, wait to connect to node1 (which it can't, because the network failure remains), call a fence to put node1 into a known state, pause for 15 seconds and then initiate a power off. Node1 dies and the services recover on node2. Now node1 boots back up, starts its pacemaker... an endless loop of fence -> recover until the network is fixed.

To avoid this, simply do not start pacemaker on boot.
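
On the same sysvinit assumption, that is just:

chkconfig pacemaker off   # the cluster stack must then be started by hand
chkconfig corosync off    # only if corosync has its own init script on your distro

You then start the cluster manually once you've verified the node (and the network) is healthy.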

As to the specifics, you can test fencing configurations easily by calling the fence agent directly at the command line. I do not use DRAC, so I can't speak to it in detail. I think you need to set lanplus and possibly define the console prompt to expect.

Using generic IPMI as an example:

fence_ipmilan -a 192.168.100.1 -l ipmiuser -p ipmipwd -o status
fence_ipmilan -a 192.168.100.2 -l ipmiuser -p ipmipwd -o status

If this returns the power state, then it is simple to convert to a pacemaker config.

configure primitive pStN1 stonith:fence_ipmilan params \
 ipaddr=192.168.100.1 login=ipmiuser passwd=ipmipwd delay=15 \
 op monitor interval=60s
configure primitive pStN2 stonith:fence_ipmilan params \
 ipaddr=192.168.100.2 login=ipmiuser passwd=ipmipwd \
 op monitor interval=60s
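
One thing not shown above: it is common to also add location constraints so that each stonith resource avoids running on the node it fences (a node can't reliably power itself off). A sketch in the same crm shell syntax, assuming pStN1 is the device that fences node1:

configure location lStN1 pStN1 -inf: node1
configure location lStN2 pStN2 -inf: node2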

Again, I *think* you need to set a couple of extra options for DRAC. Experiment at the command line before moving to the pacemaker config. Once you have the command line version working, you should be able to set it up in pacemaker. If you have trouble though, share the CLI call and we can help with the pacemaker config.
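
For what it's worth, with fence_ipmilan the lanplus option is the -P switch, so a first test against an iDRAC might look like this (untested on my side, since I don't have DRAC hardware):

fence_ipmilan -a 192.168.100.1 -l ipmiuser -p ipmipwd -P -o status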

On 25/07/13 05:39, Bruno MACADRÉ wrote:
Some modifications to my first mail:

After some research I found that external/ipmi isn't available on my
system, so I must use fence-agents.

My second question must be modified to reflect these changes, like this:

     configure primitive pStN1 stonith:fence_ipmilan params \
         ipaddr=192.168.100.1 login=ipmiuser passwd=ipmipwd
     configure primitive pStN2 stonith:fence_ipmilan params \
         ipaddr=192.168.100.2 login=ipmiuser passwd=ipmipwd

Regards,
Bruno

On 25/07/2013 10:39, Bruno MACADRÉ wrote:
Hi,

    I've just built a two-node Active/Passive cluster to have an iSCSI failover SAN.

    Some details about my configuration :

        - I have two nodes, each with 2 bonds: 1 for DRBD replication and 1 for communication
        - The iSCSI target, iSCSI LUN and virtual IP are constrained together to start on the DRBD master node

    Everything works fine, but now I need to configure fencing. I have 2 Dell PowerEdge servers with iDRAC6.

    First question: is 'external/drac5' compatible with iDRAC6? (I've found contradictory information about this...)

    Second question: is this configuration sufficient (with IPMI)?

        configure primitive pStN1 stonith:external/ipmi params \
            hostname=node1 ipaddr=192.168.100.1 userid=ipmiuser \
            passwd=ipmipwd interface=lan
        configure primitive pStN2 stonith:external/ipmi params \
            hostname=node2 ipaddr=192.168.100.2 userid=ipmiuser \
            passwd=ipmipwd interface=lan
        location lStN1 pStN1 inf: node1
        location lStN2 pStN2 inf: node2

        And after that:
        configure property stonith-enabled=true
        configure property stonith-action=poweroff

    Third (and last) question: what about quorum? At the moment I have 'no-quorum-policy="ignore"', but that's a risk, isn't it?

    Don't hesitate to ask me for more information if needed,

    Regards,
    Bruno.

--

Bruno MACADRE
-------------------------------------------------------------------
 Ingénieur Systèmes et Réseau     | Systems and Network Engineer
 Département Informatique         | Department of computer science
 Responsable Info SER             | SER IT Manager
 Université de Rouen              | University of Rouen
-------------------------------------------------------------------
Coordonnées / Contact :
        Université de Rouen
        Faculté des Sciences et Techniques - Madrillet
        Avenue de l'Université - BP12
        76801 St Etienne du Rouvray CEDEX
        FRANCE

        Tél : +33 (0)2-32-95-51-86
-------------------------------------------------------------------

