I know there is a fencing agent for Xen, but since Red Hat has pretty much dropped all support for Xen, it is not much use to me. Plus, I prefer KVM/QEMU.
To save everyone else the effort, attached is a somewhat restrictive qemu fencing agent. It is plainly not meant for production use, because the hypervisor connection is unauthenticated and unencrypted.

To set this up (on F10), libvirt needs to be configured to listen for plain TCP connections within the virtual network that has been set up for the nodes to operate in. Your nodes also need libvirt-python installed to use the agent.

The agent takes three parameters: "port", "ipaddr" and "action".

ipaddr is the IP address of the hypervisor that hosts the nodes.
port is the VM name given in the hypervisor for the victim node.
action is the action to take, one of "status", "reboot", "on" or "off".

I'm pretty sure my attempt at this is lacklustre, as it's my first go. However, I can confirm it is definitely working, at least for me!

Attached are my very basic config for this and the agent itself. Put the agent in /sbin, of course, and name it fence_qemu.

Since I selfishly built it for my own purposes only, it doesn't do anything snazzy like ssh or provide auth. That is probably not difficult to add, though, as all the facilities for it live in the libvirt bindings themselves.

Hope this saves people some time and effort. Good luck!
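On the ssh/auth point: libvirt picks its transport from the connection URI, so moving off plain TCP is mostly a matter of changing the string handed to libvirt.open(). Below is a minimal sketch of building that URI. The helper name hypervisor_uri and the example address are made up for illustration and are not part of the attached agent, and only a subset of libvirt's transports is listed. It does not import libvirt, so it can be tried anywhere.

```python
# Hypothetical helper, not part of the attached agent: build the libvirt
# connection URI for a chosen transport.

# A subset of the transports libvirt accepts in a qemu+<transport>:// URI.
TRANSPORTS = ("tcp", "tls", "ssh")


def hypervisor_uri(ip, transport="tcp", user=None):
    """Return a libvirt URI for the system QEMU instance on host `ip`."""
    if transport not in TRANSPORTS:
        raise ValueError("unsupported transport: %s" % transport)
    # An optional user name goes before the host as user@host; this is
    # mainly useful with ssh.
    host = ip if user is None else "%s@%s" % (user, ip)
    return "qemu+%s://%s/system" % (transport, host)


if __name__ == "__main__":
    # Example address only; substitute your own hypervisor.
    print(hypervisor_uri("192.168.122.1"))
    print(hypervisor_uri("192.168.122.1", "ssh", "root"))
```

With the default arguments this gives the same kind of qemu+tcp URI the agent uses today. For ssh, the fencing node would presumably still need key-based login to the hypervisor, since a fence agent cannot answer a password prompt.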
#!/usr/bin/python
#######
# Fencing agent for qemu, written by Matthew Ife. [email protected] 2009
# Licensed as GPL Version 2
######
import sys
import re
import getopt
import libvirt


def virt_act(ip, vm, action):
    # attempt to connect to the hypervisor first
    try:
        conn = libvirt.open('qemu+tcp://' + ip + '/system')
    except libvirt.libvirtError, err:
        print "There was a problem connecting to the hypervisor:",
        print str(err)
        sys.exit(2)

    # check the vm exists and get its information
    try:
        dom = conn.lookupByName(vm)
    except libvirt.libvirtError, err:
        print "There was a problem finding the domain specified:",
        print str(err)
        sys.exit(2)

    # perform some form of action. Thanks to libvirt this is pretty simple
    try:
        state = dom.info()[0]
    except libvirt.libvirtError, err:
        print "There was a problem getting the domain status:",
        print str(err)
        sys.exit(2)

    if action == "status":
        if state in (1, 2, 4, 6):
            # online states: running, blocked, shutting down, crashed.
            # bad states such as crashed are just considered online
            # for a worst case scenario
            print "on"
        elif state in (0, 3, 5):
            # offline states: no state, paused, shut off.
            # Note 0 is actually "no state", we might not want this here.
            # 3 is paused, which arguably belongs with the online states.
            print "off"
    elif action == "on":
        try:
            dom.create()
        except libvirt.libvirtError, err:
            print "A problem occurred bringing the node online:" + str(err)
            sys.exit(2)
    elif action == "off":
        try:
            dom.destroy()
        except libvirt.libvirtError, err:
            # we don't care if the node is down already
            pass
    elif action == "reboot":
        try:
            dom.destroy()
        except libvirt.libvirtError, err:
            # we don't care if the node is down already
            pass
        try:
            dom.create()
        except libvirt.libvirtError, err:
            print "A problem occurred bringing the node online:" + str(err)
            sys.exit(2)
    return


def print_help():
    print """
Usage: fence_qemu [options]
  -o --action   The action you want to take
                currently supports; off, on, status, reboot
  -n --name     The name of the virtual machine you wish to act upon
  -a --ipaddr   The IP or Host of the hypervisor to connect to
"""
    sys.exit(0)


def main():
    if len(sys.argv) > 1:
        # options given on the command line; a trailing "=" marks a long
        # option that takes a value
        try:
            opts, args = getopt.getopt(sys.argv[1:], 'o:n:a:h',
                                       ["action=", "name=", "ipaddr=", "help"])
        except getopt.GetoptError, err:
            print "You did not specify the right args"
            print str(err)
            sys.exit(1)
        for options, args in opts:
            if options in ('-o', '--action'):
                action = args
            elif options in ('-n', '--name'):
                vm = args
            elif options in ('-a', '--ipaddr'):
                ip = args
            elif options in ('-h', '--help'):
                print_help()
    else:
        # no arguments: read name=value lines from stdin
        for opt in sys.stdin.readlines():
            # skip comments and empty lines
            if re.match("#", opt) or re.match("^$", opt):
                continue
            # split on the first "=" only, so values may contain "="
            (options, args) = opt.split("=", 1)
            if options == "action":
                action = args.rstrip()
            elif options in ("port", "name"):
                vm = args.rstrip()
            elif options == "ipaddr":
                ip = args.rstrip()

    # a parameter that was never set shows up here as a NameError
    try:
        virt_act(ip, vm, action)
    except NameError, err:
        print "You did not supply the required number of parameters."
        print str(err)
        sys.exit(1)


if __name__ == "__main__":
    main()
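When the agent is run without command-line arguments, it reads its parameters as name=value lines on standard input, skipping comments and blank lines. Here is a minimal standalone sketch of that parsing step, without libvirt. The function name parse_stdin_options and the sample values are made up for illustration. Unlike the agent, it also tolerates surrounding whitespace.

```python
# Hypothetical standalone version of the agent's stdin parsing.

def parse_stdin_options(lines):
    """Parse name=value lines into a dict, skipping blanks and # comments."""
    options = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        name, _, value = line.partition("=")
        options[name.strip()] = value.strip()
    # The agent accepts either "port" or "name" for the VM name.
    if "name" in options and "port" not in options:
        options["port"] = options["name"]
    return options


if __name__ == "__main__":
    # Sample input; all values are made up.
    sample = [
        "# lines like this are ignored\n",
        "ipaddr=192.168.122.1\n",
        "port=node1\n",
        "action=status\n",
    ]
    print(parse_stdin_options(sample))
```

To try the real agent by hand, the same three lines can be piped into it, for example with printf 'ipaddr=192.168.122.1\nport=node1\naction=status\n' | /sbin/fence_qemu, again with your own address and VM name substituted.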
Attachment: cluster.conf (XML document)
