Johannes Freygner <han...@freygner.at> wrote:
> could somebody give me an idea what will be the best stonith solution on a
> drbd cluster to avoid split brain if the network between the nodes is lost.
>
> I have already tried to use stonith with ILO, but if the power cable is
> removed from the node (because we have to service the hardware) the resource
> will not start on the remaining node, because the remaining node can't fence
> the removed node.

There should be nothing wrong with using ILO, but please read on. (In fact, ILO/ALOM/DRAC-based fencing is the cleanest solution for enterprise-grade hardware with multiple power sources.)

If you're bringing a node down for maintenance, fencing shouldn't occur. If you do a 'shutdown -r now' on one node, does that node normally get fenced by the other? If so, does doing a 'service corosync stop' allow that node to leave the cluster cleanly without being fenced?

If the answer to both is 'yes', then you probably have an rc script sequencing problem that you should deal with first.

If you're running RHEL/CentOS or a derivative, have a look at your corosync rc script. If it has

    # chkconfig: - 20 20

then change it to

    # chkconfig: - 75 25

and regenerate the rc symlinks with the new priorities:

    chkconfig corosync resetpriorities

Remember to do this on all nodes.

Now if you do a 'shutdown -r now' (or -h) on one node, it should not get fenced, and your resources should all be moved cleanly to the remaining node before the first node goes down.

Devin
-- 
If it's sinful, it's more fun.
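
To make the rc script check above easier to repeat on every node, here is a minimal sketch that reads the start and stop priorities from the "# chkconfig:" header. The function names `chkconfig_priorities` and `has_old_priorities` are hypothetical helpers, and the default path /etc/init.d/corosync is an assumption about where your distribution installs the script; pass a different path if yours differs.

```shell
#!/bin/sh
# Hypothetical helpers, not part of corosync or chkconfig.

# Print "<start> <stop>" from the first "# chkconfig:" header line.
# Header format: "# chkconfig: <runlevels> <start> <stop>"
chkconfig_priorities() {
    # Default path is an assumption; override with $1.
    awk '/^# *chkconfig:/ { print $(NF-1), $NF; exit }' \
        "${1:-/etc/init.d/corosync}"
}

# Succeed if the script still carries the "20 20" priorities
# that the reply above suggests changing to "75 25".
has_old_priorities() {
    [ "$(chkconfig_priorities "$1")" = "20 20" ]
}
```

A possible use is `has_old_priorities /etc/init.d/corosync && echo "header still has 20 20"`. The sketch only reads the file; editing the header and regenerating the rc symlinks remain manual steps.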