>>> Strahil Nikolov <[email protected]> wrote on 04.04.2022 at 09:21 in message
<[email protected]>:
> Do you have a resource for starting up libvirtd and virtlockd after the
> OCFS2?
Yes:

primitive prm_libvirtd systemd:libvirtd.service ...
primitive prm_lockspace_ocfs2 Filesystem ...
primitive prm_virtlockd systemd:virtlockd ...
clone cln_libvirtd prm_libvirtd ...
clone cln_lockspace_ocfs2 prm_lockspace_ocfs2 ...
clone cln_virtlockd prm_virtlockd ...
colocation col__libvirtd__virtlockd inf: cln_libvirtd cln_virtlockd
colocation col__virtlockd__lockspace_fs inf: cln_virtlockd cln_lockspace_ocfs2
colocation col__vm__libvirtd inf: ( prm_xen_v01 ... ) cln_libvirtd
order ord__libvirtd__vm Mandatory: cln_libvirtd ( prm_xen_v01 ... )
order ord__lockspace_fs__virtlockd Mandatory: cln_lockspace_ocfs2 cln_virtlockd
order ord__virtlockd__libvirtd Mandatory: cln_virtlockd cln_libvirtd

(some resources were left out, but you get the idea)

Regards,
Ulrich

P.S.: Forgot to keep the list in the replies...

> Best Regards,
> Strahil Nikolov
>
> On Mon, Apr 4, 2022 at 10:14, Ulrich Windl <[email protected]> wrote:
> >>> Strahil Nikolov <[email protected]> wrote on 04.04.2022 at 08:42 in
> >>> message <[email protected]>:
>> So, if you use OCFS2 for locking, why is the Hypervisor not responding
>> correctly to the Virt RA?
>
> It seems the VirtualDomain RA requires libvirtd to be running, but at the
> time of the startup probes _nothing_ is running.
> That's how I see it.
>
> pacemaker-controld[7029]:  notice: Result of probe operation for
> prm_xen_rksapv15 on rksaph18: not running
> ### For whatever reason:
> pacemaker-execd[7021]:  notice: executing - rsc:prm_xen_v15 action:stop
> call_id:197
> VirtualDomain(prm_xen_v15)[8768]: INFO: Virtual domain v15 currently has
> no state, retrying.
> VirtualDomain(prm_xen_v15)[8822]: ERROR: Virtual domain v15 has no state
> during stop operation, bailing out.
> VirtualDomain(prm_xen_v15)[8836]: INFO: Issuing forced shutdown (destroy)
> request for domain v15.
> VirtualDomain(prm_xen_v15)[8849]: ERROR: forced stop failed
> pacemaker-controld[7029]:  notice: h18-prm_xen_v15_stop_0:197 [ error:
> failed to connect to the hypervisor error: failed to connect socket to
> '/var/run/libvirt/libvirt-sock': no such file or directory
>
> That caused a repeating fencing loop.
>
> Regards,
> Ulrich
>
>> Best Regards,
>> Strahil Nikolov
>>
>> On Mon, Apr 4, 2022 at 9:39, Ulrich Windl <[email protected]> wrote:
>> >>> Strahil Nikolov <[email protected]> wrote on 01.04.2022 at 15:20 in
>> >>> message <[email protected]>:
>>> To be honest, I have never had to disable it, and as far as I know it's
>>> cluster-wide.
>>> As per my understanding, the cluster checks if the resources are running
>>> before proceeding further. Of course, I might be wrong and it might not
>>> help you.
>>> Why don't you set up a shared filesystem for libvirt's locking? After
>>> all, your VMs use shared storage.
>>
>> ??? There is a shared OCFS2 filesystem used for locking, but that's more
>> a problem than a solution.
>> I wrote: "libvirtd uses locking (virtlockd), which in turn needs a
>> cluster-wide filesystem for locks across the nodes."
>>
>> Regards,
>> Ulrich
>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On Fri, Apr 1, 2022 at 15:01, Ulrich Windl <[email protected]> wrote:
>>> >>> Strahil Nikolov <[email protected]> wrote on 01.04.2022 at 00:45 in
>>> >>> message <[email protected]>:
>>>
>>> Hi!
>>>
>>>> What about if you disable the enable-startup-probes at fencing (custom
>>>> fencing that sets it to false and fails, so the next fencing device in
>>>> the topology kicks in)?
>>>
>>> Interesting idea, but I never heard of the property before.
>>> However it's cluster-wide, right?
>>>
>>>> When the node joins, it will skip startup probes, and later a systemd
>>>> service or some script checks if all nodes were up for at least 15-20
>>>> minutes and enables it back?
>>>
>>> Are there any expected disadvantages?
>>>
>>> Regards,
>>> Ulrich
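For what it's worth, toggling that property cluster-wide is a single command;
the following is only a rough sketch of what such a "disable before the fenced
node rejoins, re-enable later" hook might run (standard Pacemaker CLI, not
taken from an actual fencing agent):

  # before the fenced node rejoins: skip startup probes cluster-wide
  crm_attribute --type crm_config --name enable-startup-probes --update false

  # later, once all nodes have stayed up long enough, turn probes back on
  crm_attribute --type crm_config --name enable-startup-probes --update true

  # check the current setting
  crm_attribute --type crm_config --name enable-startup-probes --query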
>>>> Best Regards,
>>>> Strahil Nikolov
>>>>
>>>> On Thu, Mar 31, 2022 at 14:02, Ulrich Windl <[email protected]> wrote:
>>>> >>> "Gao,Yan" <[email protected]> wrote on 31.03.2022 at 11:18 in message
>>>> >>> <[email protected]>:
>>>>> On 2022/3/31 9:03, Ulrich Windl wrote:
>>>>>> Hi!
>>>>>>
>>>>>> I just wanted to point out one thing that hit us with SLES15 SP3:
>>>>>> A failed live VM migration that caused node fencing resulted in a
>>>>>> fencing loop, for two reasons:
>>>>>>
>>>>>> 1) Pacemaker thinks that even _after_ fencing there is some migration
>>>>>> to "clean up". Pacemaker treats the situation as if the VM is running
>>>>>> on both nodes, thus (50% chance?) trying to stop the VM on the node
>>>>>> that just booted after fencing. That's stupid, but shouldn't be fatal
>>>>>> IF there weren't...
>>>>>>
>>>>>> 2) The stop operation of the VM (that actually isn't running) fails,
>>>>>
>>>>> AFAICT it could not connect to the hypervisor, but the logic in the RA
>>>>> is kind of arguable: the probe (monitor) of the VM returned "not
>>>>> running", but the stop right after that returned failure...
>>>>>
>>>>> OTOH, the point about pacemaker is that the stop of the resource on the
>>>>> fenced and rejoined node is not really necessary. There have been
>>>>> discussions about this here, and we are trying to figure out a solution
>>>>> for it:
>>>>>
>>>>> https://github.com/ClusterLabs/pacemaker/pull/2146#discussion_r828204919
>>>>>
>>>>> For now it requires the administrator's intervention if the situation
>>>>> happens:
>>>>> 1) Fix the access to the hypervisor before the fenced node rejoins.
>>>>
>>>> Thanks for the explanation!
>>>>
>>>> Unfortunately this can be tricky if libvirtd is involved (as it is here):
>>>> libvirtd uses locking (virtlockd), which in turn needs a cluster-wide
>>>> filesystem for locks across the nodes.
>>>> When that filesystem is provided by the cluster, it's hard to delay node
>>>> joining until the filesystem, virtlockd and libvirtd are running.
>>>>
>>>> (The issue had been discussed before: It does not make sense to run some
>>>> probes when those probes need other resources to detect the status.
>>>> With just a Boolean status return, at best all those probes could say
>>>> "not running". Ideally a third status like "please try again some later
>>>> time" would be needed, or probes should follow the dependencies of their
>>>> resources (which may open another can of worms).)
>>>>
>>>> Regards,
>>>> Ulrich
>>>>
>>>>> 2) Manually clean up the resource, which tells pacemaker it can safely
>>>>> forget the historical migrate_to failure.
>>>>>
>>>>> Regards,
>>>>> Yan
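Yan's second step would boil down to something like this once libvirtd is
reachable again on the rejoined node (a sketch using the resource and node
names from the logs in this thread):

  # tell pacemaker to forget the failed migrate_to / stop history
  crm_resource --cleanup --resource prm_xen_v15 --node h18

  # or, with crmsh:
  crm resource cleanup prm_xen_v15 h18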
>>>>>> causing a node fence. So the loop is complete.
>>>>>>
>>>>>> Some details (many unrelated messages left out):
>>>>>>
>>>>>> Mar 30 16:06:14 h16 libvirtd[13637]: internal error: libxenlight
>>>>>> failed to restore domain 'v15'
>>>>>>
>>>>>> Mar 30 16:06:15 h19 pacemaker-schedulerd[7350]:  warning: Unexpected
>>>>>> result (error: v15: live migration to h16 failed: 1) was recorded for
>>>>>> migrate_to of prm_xen_v15 on h18 at Mar 30 16:06:13 2022
>>>>>>
>>>>>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected
>>>>>> result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at
>>>>>> Mar 30 16:13:36 2022
>>>>>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Unexpected
>>>>>> result (OCF_TIMEOUT) was recorded for stop of prm_libvirtd:0 on h18 at
>>>>>> Mar 30 16:13:36 2022
>>>>>> Mar 30 16:13:37 h19 pacemaker-schedulerd[7350]:  warning: Cluster node
>>>>>> h18 will be fenced: prm_libvirtd:0 failed there
>>>>>>
>>>>>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  warning: Unexpected
>>>>>> result (error: v15: live migration to h18 failed: 1) was recorded for
>>>>>> migrate_to of prm_xen_v15 on h16 at Mar 29 23:58:40 2022
>>>>>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  error: Resource
>>>>>> prm_xen_v15 is active on 2 nodes (attempting recovery)
>>>>>>
>>>>>> Mar 30 16:19:00 h19 pacemaker-schedulerd[7350]:  notice:  * Restart
>>>>>> prm_xen_v15 ( h18 )
>>>>>>
>>>>>> Mar 30 16:19:04 h18 VirtualDomain(prm_xen_v15)[8768]: INFO: Virtual
>>>>>> domain v15 currently has no state, retrying.
>>>>>> Mar 30 16:19:05 h18 VirtualDomain(prm_xen_v15)[8787]: INFO: Virtual
>>>>>> domain v15 currently has no state, retrying.
>>>>>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8822]: ERROR: Virtual
>>>>>> domain v15 has no state during stop operation, bailing out.
>>>>>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8836]: INFO: Issuing
>>>>>> forced shutdown (destroy) request for domain v15.
>>>>>> Mar 30 16:19:07 h18 VirtualDomain(prm_xen_v15)[8860]: ERROR: forced
>>>>>> stop failed
>>>>>>
>>>>>> Mar 30 16:19:07 h19 pacemaker-controld[7351]: notice: Transition 124
>>>>>> action 115 (prm_xen_v15_stop_0 on h18): expected 'ok' but got 'error'
>>>>>>
>>>>>> Note: Our cluster nodes start pacemaker during boot. Yesterday I was
>>>>>> there when the problem happened. But as we had another boot loop some
>>>>>> time ago, I wrote a systemd service that counts boots, and if too many
>>>>>> happen within a short time, pacemaker will be disabled on that node.
>>>>>> As it is set now, the counter is reset if the node is up for at least
>>>>>> 15 minutes; if it fails more than 4 times to do so, pacemaker will be
>>>>>> disabled. If someone wants to try that or give feedback, drop me a
>>>>>> line, so I could provide the RPM (boot-loop-handler-0.0.5-0.0.noarch)...
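The boot-loop-handler package itself is not included here; a minimal sketch of
the idea (hypothetical paths and thresholds, run as a plain long-running
systemd service started at boot, not the actual RPM) might look like:

  #!/bin/sh
  # Count consecutive boots that did not survive 15 minutes; after more than
  # 4 of them, disable pacemaker so the node stops rejoining and being fenced.
  COUNT_FILE=/var/lib/boot-loop-handler/count   # hypothetical location
  mkdir -p "${COUNT_FILE%/*}"
  count=$(cat "$COUNT_FILE" 2>/dev/null)
  count=$(( ${count:-0} + 1 ))
  echo "$count" > "$COUNT_FILE"

  if [ "$count" -gt 4 ]; then
      logger -t boot-loop-handler "suspected boot loop ($count rapid boots), disabling pacemaker"
      systemctl disable --now pacemaker.service
      exit 0
  fi

  # Still running 15 minutes later: the boot counts as good, reset the counter.
  sleep 900
  echo 0 > "$COUNT_FILE"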
>>>>>> Regards,
>>>>>> Ulrich

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
