Hi. I have a 7-node corosync / pacemaker cluster which is working nicely as a proof-of-concept.
Three machines are in data centre 1, three are in data centre 2, and one machine is in data centre 3. I'm using location constraints to run one set of resources on any of the machines in DC1 and another set on the machines in DC2; the DC3 machine does nothing except act as a quorum server in case DC1 and DC2 lose sight of each other. Everything is currently on externally-hosted virtual machines (by which I mean I have no access to the hosting environment).

I now want to implement fencing. For the PoC VMs I plan to use external/ssh to reboot a problem server; once things move to real hardware we shall have some sort of IPMI/RAC/PDU control.

Reading https://clusterlabs.org/pacemaker/doc/crm_fencing.html it seems that I define this just like any other resource in the system, but it's not clear to me how many resources I need to define. When a machine needs restarting, any other machine in the cluster can do it: all have public-key SSH access to all the others, and for IPMI/RAC/PDU every machine will have credentials to connect to the power controller for every other machine.

So, do I simply create one stonith resource for each server (i.e. one per machine to be fenced), and rely on whichever other server the cluster picks to invoke it when needed? Or do I create one stonith resource for each server, meaning that that server can shut down any of the others? Or do I need to create 6 x 7 = 42 stonith resources, so that any machine can shut down any other?

Thanks for any guidance, or pointers to more comprehensive documentation.

Antony

--
BASIC is to computer languages what Roman numerals are to arithmetic.

Please reply to the list; please *don't* CC me.
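P.S. To make the first option concrete, this is roughly what I have in mind for each node, in crmsh syntax (pcmk-1 is just a placeholder for one of the real node names; the same pattern would be repeated for the other six):

    # Fencing resource that can reboot pcmk-1 over SSH (cluster-glue external/ssh agent)
    crm configure primitive fence-pcmk-1 stonith:external/ssh \
            params hostlist="pcmk-1" \
            op monitor interval=60s

    # Keep the device off the node it is meant to fence, so any surviving node can run it
    crm configure location l-fence-pcmk-1 fence-pcmk-1 -inf: pcmk-1

    # Once a device exists for every node, enable fencing cluster-wide
    crm configure property stonith-enabled=true

That would mean seven stonith resources in total, one per machine, rather than 42.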