On 10/10/2016 06:56 PM, Ken Gaillot wrote:
> On 10/10/2016 10:21 AM, Klaus Wenninger wrote:
>> On 10/10/2016 04:54 PM, Ken Gaillot wrote:
>>> On 10/10/2016 07:36 AM, Pavel Levshin wrote:
>>>> 10.10.2016 15:11, Klaus Wenninger:
>>>>> On 10/10/2016 02:00 PM, Pavel Levshin wrote:
>>>>>> 10.10.2016 14:32, Klaus Wenninger:
>>>>>>> Why are the order constraints between libvirtd & the VMs optional?
>>>>>> If they were mandatory, then all the virtual machines would be restarted when libvirtd restarts. This is neither desired nor needed. When this happens, the node is fenced because it is unable to restart a VM in the absence of a working libvirtd.
>>>>> I was guessing something like that ...
>>>>> So let me reformulate my question: why does libvirtd have to be restarted?
>>>>> If it is because of config changes, making it reloadable might be a solution ...
>>>>>
>>>> Right, config changes come to mind first of all. But sometimes a service, including libvirtd, may fail unexpectedly. In this case I would prefer to restart it without disturbing the VirtualDomains, which will otherwise fail eternally.
>>> I think the mandatory colocation of VMs with libvirtd negates your goal. If libvirtd stops, the VMs will have to stop anyway because they can't be colocated with libvirtd. Making the colocation optional should fix that.
>>>
>>>> The question is, why does the cluster not obey the optional constraint when both libvirtd and the VM stop in a single transition?
>>> If it truly is in the same transition, then it should be honored.
>>>
>>> You have *mandatory* constraints for DLM -> CLVMd -> cluster-config -> libvirtd, but only an *optional* constraint for libvirtd -> VMs. Therefore, libvirtd will generally have to wait longer than the VMs to be started.
>>>
>>> It might help to add mandatory constraints for cluster-config -> VMs. That way, they have the same requirements as libvirtd, and are more likely to start in the same transition.
>>>
>>> However, I'm sure there are still problematic situations. What you want is a simple idea, but a rather complex specification: "If rsc1 fails, block any instances of this other RA on the same node."
>>>
>>> It might be possible to come up with some node attribute magic to enforce this. You'd need some custom RAs. I imagine something like one RA that sets a node attribute, and another RA that checks it.
>>>
>>> The setter would be grouped with libvirtd. Anytime libvirtd starts, the setter would set a node attribute on the local node. Anytime libvirtd stopped or failed, the setter would unset the attribute value.
>>>
>>> The checker would simply monitor the attribute, and fail if the attribute is unset. The group would have on-fail=block. So anytime the attribute was unset, the VM would not be started or stopped. (There would be no constraints between the two groups -- the checker RA would take the place of constraints.)
>> How would that behave differently from just putting libvirtd into this on-fail=block group? (Apart from, of course, the possibility of grouping the VMs into more than one group ...)
> You could stop or restart libvirtd without stopping the VMs. It would cause a "failure" of the checker that would need to be cleaned up later, but the VMs wouldn't stop.
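
[For illustration only, a minimal sketch of the setter/checker idea described above, using transient node attributes via attrd_updater. The attribute name, agent skeleton, and resource names are invented for this example and are not from the thread; a real OCF agent would also need meta-data, validate-all, and proper probe handling.]

    #!/bin/sh
    # Hypothetical "checker" core. The matching "setter", grouped with libvirtd,
    # would run "attrd_updater -n libvirtd-alive -U 1" on start and
    # "attrd_updater -n libvirtd-alive -D" on stop/failure.
    ATTR="libvirtd-alive"

    case "$1" in
        start|stop)
            exit 0    # OCF_SUCCESS: nothing real to start or stop
            ;;
        monitor)
            # The exact query output format varies between Pacemaker versions;
            # this only checks whether the attribute is currently set to 1 here.
            if attrd_updater -n "$ATTR" -Q 2>/dev/null | grep -q 'value="1"'; then
                exit 0    # attribute set: libvirtd considered alive on this node
            else
                exit 7    # OCF_NOT_RUNNING: reported as a failure of the started
                          # checker, so with on-fail=block the VM group is frozen
            fi
            ;;
        *)
            exit 3    # OCF_ERR_UNIMPLEMENTED (meta-data etc. omitted in sketch)
            ;;
    esac

[The checker would then sit at the head of a group with the VirtualDomain resources, with on-fail=block on its monitor operation, so an unset attribute blocks the VMs instead of stopping them.]
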
Ah, yes, forgot about the manual restart case. Had already turned that into a reload in my mind ;-)
As long as libvirtd is a systemd unit ... would a restart via systemd create a similar behavior?
But forget about it ... with pacemaker being able to receive systemd events, we should probably not foster this use-case ;-)
>
>>> I haven't thought through all possible scenarios, but it seems feasible to me.
>>>
>>>> In my eyes, these services are bound by a HARD, obvious colocation constraint: a VirtualDomain should never ever be touched in the absence of a working libvirtd. Unfortunately, I cannot figure out a way to reflect this constraint in the cluster.
>>>>
>>>>
>>>> --
>>>> Pavel Levshin
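
[For reference, a rough pcs-style sketch of the constraint layout discussed in this thread; all resource names below are placeholders rather than anything from the original configuration.]

    # Mandatory ordering chain: DLM -> CLVMd -> cluster-config -> libvirtd
    pcs constraint order start dlm-clone then start clvmd-clone
    pcs constraint order start clvmd-clone then start cluster-config-clone
    pcs constraint order start cluster-config-clone then start libvirtd-clone

    # Optional ordering plus mandatory colocation for one VM, as described above
    pcs constraint order start libvirtd-clone then start my-vm kind=Optional
    pcs constraint colocation add my-vm with libvirtd-clone INFINITY

    # Ken's suggestion: give the VM the same mandatory prerequisite as libvirtd,
    # so both are more likely to end up in the same transition
    pcs constraint order start cluster-config-clone then start my-vm

    # With the setter/checker approach, libvirtd could then be restarted through
    # the cluster without stopping the VMs; the resulting checker "failure" would
    # need to be cleaned up afterwards
    pcs resource restart libvirtd-clone
    pcs resource cleanup libvirtd-checker
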
