Hi Andrew, hi all.

I decided to return to this topic because of problems with libvirt/KVM virtual domains controlled by pacemaker.

The libvirt package on Fedora 13 ships two init scripts, libvirtd and libvirt-guests, with the following chkconfig values:

    libvirtd:       97 03
    libvirt-guests: 98 02

The pacemaker MCP currently has 90 10.

If one wants to control libvirtd and the virtual domains as HA resources from within pacemaker, the first solution that comes to mind is to disable both libvirt init scripts (set them to the 'off' state):

    chkconfig libvirtd off
    chkconfig libvirt-guests off

and then add a cloned lsb libvirtd resource to pacemaker, followed by the VirtualDomain resources. I haven't actually tried moving libvirtd control to pacemaker yet; I am just looking for possible pitfalls.

Unfortunately, this will not (?) work as seamlessly as expected. libvirtd will be skipped during the initscripts start sequence and started by pacemaker, which is OK, but there should be problems during the stop sequence:

1) (02) init stops libvirt-guests (saves their state and powers them off)
2) (03) init stops libvirtd
3) (10) init sends the stop signal to the pacemaker MCP
4) pacemaker makes unneeded moves trying to recover resources (I suppose so)

What I see with libvirtd run from init: the virtual domains are hibernated, and pacemaker starts them again right after that (it doesn't yet know that the system is shutting down). Then libvirtd is stopped and pacemaker loses control of the VirtualDomain resources, moving them to the 'Started (unmanaged)' state. Then pacemaker hangs (for a long time at least) trying to stop all resources. I suppose this is where stonith should do the trick (it is still disabled here).

I understand that my setup could be considered "broken" in its current state, but the problem is a bit wider. Actually, no LSB resource should be stopped by init while pacemaker runs, because those resources could be, and will be, (incorrectly) considered by init as subject to its control.

The next thing one can do is remove such LSB resources from init's "service zone" by issuing "chkconfig --del <service>".

Removing the service this way will work, but if some RPM package has a "broken" (actually not) 'post' script that unconditionally adds the service to init's service zone again, then after an upgrade of that package the system will return to the same state as before.

So the next solution would be to move pacemaker to start really last (99) and stop really first (01). This is what Vadim Chepkov suggested earlier and what I am inclined to do (at least for my RPM packages). Of course, there are other services that have 99 01 too, but I'd turn a blind eye to them.

From a logical point of view, it would be correct to move a (generally speaking) "non-native" engine that controls LSB services (and is an LSB service itself) to the very end of the startup sequence so that it does not interfere with the "native" tools: by the time the native tools come into play in the stop sequence, everything from the higher layer is already done.

Looking forward to hearing any comments,
Vladislav
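
P.S. To make the ordering argument concrete, here is a rough Python sketch of it. It is only a model, not something that touches a real system. It assumes init runs the stop scripts in ascending order of their stop priority (ties broken by name) and that init still runs the stop scripts for the libvirt services. The priorities are the chkconfig values quoted above; the helper names stop_order and stopped_before are made up for illustration.

```python
# Toy model of the SysV stop sequence. Assumption: init stops services in
# ascending order of their chkconfig stop priority, ties broken by name.
# The helper names below are hypothetical, not part of any real tool.

def stop_order(services):
    """Return service names in the order init would stop them.

    `services` maps name -> (start_priority, stop_priority).
    """
    return [
        name
        for name, (_start, stop) in sorted(
            services.items(), key=lambda item: (item[1][1], item[0])
        )
    ]


def stopped_before(services, manager):
    """Return the services init stops while `manager` is still running."""
    order = stop_order(services)
    return order[: order.index(manager)]


# chkconfig values quoted in this mail (start, stop).
current = {
    "libvirtd": (97, 3),
    "libvirt-guests": (98, 2),
    "pacemaker": (90, 10),
}

# The proposed change: pacemaker starts last (99) and stops first (01).
proposed = dict(current, pacemaker=(99, 1))

print(stop_order(current))
print(stopped_before(current, "pacemaker"))
print(stopped_before(proposed, "pacemaker"))
```

With the current values, both libvirt services are stopped while pacemaker is still running. With 99 01, nothing in this small set is stopped before pacemaker. Other services that also use stop priority 01 are not modelled here.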