Erm... the network/firewall is always "green". Run tcpdump on Host1 and on VM2 (the guest that is NOT on that host), then run 'fence_xvm -o list' again and check what is captured.
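For example, something along these lines will show whether the multicast request from a guest ever reaches the other hypervisor (a rough sketch; the interface names br0 and ens3 are assumptions, use whatever bridge/NIC carries the 192.168.1.x traffic):

On the CentOS host:
# tcpdump -n -i br0 port 1229 or host 225.0.0.12

On the Ubuntu guest running on the other host:
# tcpdump -n -i ens3 port 1229 or host 225.0.0.12

Then, in a second shell on that guest:
# fence_xvm -a 225.0.0.12 -o list

If the request shows up in the guest's own capture but never on the other host's bridge, the multicast packets are being dropped somewhere in between (IGMP snooping on a switch is a common culprit).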
In summary, you need:
- the key deployed on the hypervisors
- the key deployed on the VMs
- fence_virtd running on both hypervisors
- the firewall opened (1229/udp for the hosts, 1229/tcp for the guests)
- fence_xvm on both VMs

In your case, the primary suspect is multicast traffic.

Best Regards,
Strahil Nikolov

On 8 July 2020 16:33:45 GMT+03:00, "stefan.schm...@farmpartner-tec.com" <stefan.schm...@farmpartner-tec.com> wrote:
>Hello,
>
>>I can't find fence_virtd for Ubuntu18, but it is available for Ubuntu20.
>
>We have now upgraded our server to Ubuntu 20.04 LTS and installed the packages fence-virt and fence-virtd.
>
>The command "fence_xvm -a 225.0.0.12 -o list" on the hosts still returns just the single local VM.
>
>The same command on both VMs results in:
># fence_xvm -a 225.0.0.12 -o list
>Timed out waiting for response
>Operation failed
>
>But just as before, connecting from the guest to the host via nc works fine:
># nc -z -v -u 192.168.1.21 1229
>Connection to 192.168.1.21 1229 port [udp/*] succeeded!
>
>So the hosts and the service are basically reachable.
>
>I have spoken to our firewall tech; he has assured me that no local traffic is hindered by anything, be it multicast or not. Software firewalls are not present/active on any of our servers.
>
>Ubuntu guests:
># ufw status
>Status: inactive
>
>CentOS hosts:
># systemctl status firewalld
>● firewalld.service - firewalld - dynamic firewall daemon
>   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
>   Active: inactive (dead)
>     Docs: man:firewalld(1)
>
>Any hints or help on how to remedy this problem would be greatly appreciated!
>
>Kind regards
>Stefan Schmitz
>
>
>On 07.07.2020 10:54, Klaus Wenninger wrote:
>> On 7/7/20 10:33 AM, Strahil Nikolov wrote:
>>> I can't find fence_virtd for Ubuntu18, but it is available for Ubuntu20.
>>>
>>> Your other option is to get an iSCSI device from your quorum system and use that for SBD.
>>> For the watchdog, you can use the 'softdog' kernel module, or you can use KVM to present one to the VMs.
>>> You can also check the '-P' flag for SBD.
>> With KVM please use the qemu watchdog and try to avoid using softdog with SBD.
>> Especially if you are aiming for a production cluster ...
>>
>> Adding something like this to the libvirt XML should do the trick:
>>   <watchdog model='i6300esb' action='reset'>
>>     <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>>   </watchdog>
>>
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>>
>>> On 7 July 2020 10:11:38 GMT+03:00, "stefan.schm...@farmpartner-tec.com" <stefan.schm...@farmpartner-tec.com> wrote:
>>>>> What does 'virsh list' give you on the 2 hosts? Hopefully different names for the VMs ...
>>>> Yes, each host shows its own VM:
>>>>
>>>> # virsh list
>>>>  Id   Name     Status
>>>> ----------------------------------------------------
>>>>  2    kvm101   running
>>>>
>>>> # virsh list
>>>>  Id   Name     State
>>>> ----------------------------------------------------
>>>>  1    kvm102   running
>>>>
>>>>
>>>>> Did you try 'fence_xvm -a {mcast-ip} -o list' on the guests as well?
>>>> fence_xvm sadly does not work on the Ubuntu guests. The howto said to run "yum install fence-virt fence-virtd", but those packages do not exist as such in Ubuntu 18.04. After we tried to find the appropriate packages we installed "libvirt-clients" and "multipath-tools". Is there maybe something missing or completely wrong?
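For reference, Strahil's checklist above boils down to roughly the following (a sketch only, not a verified recipe; /etc/cluster/fence_xvm.key is the usual default key path, and the firewall commands only apply where a firewall is actually running):

On each CentOS host:
# mkdir -p /etc/cluster
# dd if=/dev/urandom of=/etc/cluster/fence_xvm.key bs=512 count=1
# fence_virtd -c          (select the multicast listener, the libvirt backend, and the bridge the VMs are attached to)
# systemctl enable --now fence_virtd
# firewall-cmd --permanent --add-port=1229/udp && firewall-cmd --reload

Copy the same key to /etc/cluster/fence_xvm.key on each Ubuntu guest (for a single-multicast-address setup all hosts and guests share one key), open 1229/tcp there if a firewall is active, and then verify from a guest with:
# fence_xvm -a 225.0.0.12 -o list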
>>>> Though we can connect to both hosts using "nc -z -v -u 192.168.1.21 1229", that just works fine.
>>>>
>> Without fence-virt you can't expect the whole thing to work.
>> Maybe you can build it for your Ubuntu version from the sources of a package for another Ubuntu version if it doesn't exist yet.
>> Btw, which pacemaker version are you using?
>> There was a convenience fix on the master branch for at least a couple of days (sometime during the 2.0.4 release cycle) that wasn't compatible with fence_xvm.
>>>>> Usually, the biggest problem is the multicast traffic - as in many environments it can be dropped by firewalls.
>>>> To make sure, I have asked our datacenter techs to verify that multicast traffic can move unhindered in our local network. In the past they have confirmed on multiple occasions that local traffic is not filtered in any way, but until now I had never specifically asked about multicast traffic, which I now did. I am waiting for an answer to that question.
>>>>
>>>> Kind regards
>>>> Stefan Schmitz
>>>>
>>>> On 06.07.2020 11:24, Klaus Wenninger wrote:
>>>>> On 7/6/20 10:10 AM, stefan.schm...@farmpartner-tec.com wrote:
>>>>>> Hello,
>>>>>>
>>>>>>>> # fence_xvm -o list
>>>>>>>> kvm102                 bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>>>>>> This should show both VMs, so getting to that point will likely solve your problem. fence_xvm relies on multicast, there could be some obscure network configuration needed to get that working on the VMs.
>>>>> You said you tried on both hosts. What does 'virsh list' give you on the 2 hosts? Hopefully different names for the VMs ...
>>>>> Did you try 'fence_xvm -a {mcast-ip} -o list' on the guests as well?
>>>>> Did you try pinging via the physical network that is connected to the bridge configured to be used for fencing?
>>>>> If I got it right, fence_xvm should support collecting answers from multiple hosts, but I found a suggestion to do a setup with 2 multicast addresses & keys, one for each host.
>>>>> Which route did you go?
>>>>>
>>>>> Klaus
>>>>>> Thank you for pointing me in that direction. We have tried to solve that, but with no success. We were using the howto provided here:
>>>>>> https://wiki.clusterlabs.org/wiki/Guest_Fencing
>>>>>>
>>>>>> Problem is, it specifically states that the tutorial does not yet support the case where guests are running on multiple hosts. There are some short hints about what might be necessary to do, but working through those sadly did not work, nor were there any clues that would help us find a solution ourselves. So now we are completely stuck here.
>>>>>>
>>>>>> Has someone got the same configuration with guest VMs on multiple hosts? And how did you manage to get that to work? What do we need to do to resolve this? Is there maybe even someone who would be willing to take a closer look at our server? Any help would be greatly appreciated!
>>>>>>
>>>>>> Kind regards
>>>>>> Stefan Schmitz
>>>>>>
>>>>>>
>>>>>> On 03.07.2020 02:39, Ken Gaillot wrote:
>>>>>>> On Thu, 2020-07-02 at 17:18 +0200, stefan.schm...@farmpartner-tec.com wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I hope someone can help with this problem. We are (still) trying to get stonith working to achieve a running active/active HA cluster, but sadly to no avail.
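Regarding Klaus's question above about which route to go: the "2 multicast addresses & keys" variant would look roughly like this (a sketch under assumptions; the second address 225.0.0.13 and the key file names are made up for illustration):

Host A runs fence_virtd listening on 225.0.0.12 with /etc/cluster/fence_xvm_hostA.key
Host B runs fence_virtd listening on 225.0.0.13 with /etc/cluster/fence_xvm_hostB.key
Both key files are copied to both guests.

Then, from either guest:
# fence_xvm -a 225.0.0.12 -k /etc/cluster/fence_xvm_hostA.key -o list     (should list the VM on host A)
# fence_xvm -a 225.0.0.13 -k /etc/cluster/fence_xvm_hostB.key -o list     (should list the VM on host B)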
>>>>>>>>
>>>>>>>> There are 2 CentOS hosts. On each one there is a virtual Ubuntu VM. The Ubuntu VMs are the ones which should form the HA cluster.
>>>>>>>>
>>>>>>>> The current status is this:
>>>>>>>>
>>>>>>>> # pcs status
>>>>>>>> Cluster name: pacemaker_cluster
>>>>>>>> WARNING: corosync and pacemaker node names do not match (IPs used in setup?)
>>>>>>>> Stack: corosync
>>>>>>>> Current DC: server2ubuntu1 (version 1.1.18-2b07d5c5a9) - partition with quorum
>>>>>>>> Last updated: Thu Jul 2 17:03:53 2020
>>>>>>>> Last change: Thu Jul 2 14:33:14 2020 by root via cibadmin on server4ubuntu1
>>>>>>>>
>>>>>>>> 2 nodes configured
>>>>>>>> 13 resources configured
>>>>>>>>
>>>>>>>> Online: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>
>>>>>>>> Full list of resources:
>>>>>>>>
>>>>>>>>  stonith_id_1   (stonith:external/libvirt):     Stopped
>>>>>>>>  Master/Slave Set: r0_pacemaker_Clone [r0_pacemaker]
>>>>>>>>      Masters: [ server4ubuntu1 ]
>>>>>>>>      Slaves: [ server2ubuntu1 ]
>>>>>>>>  Master/Slave Set: WebDataClone [WebData]
>>>>>>>>      Masters: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>  Clone Set: dlm-clone [dlm]
>>>>>>>>      Started: [ server2ubuntu1 server4ubuntu1 ]
>>>>>>>>  Clone Set: ClusterIP-clone [ClusterIP] (unique)
>>>>>>>>      ClusterIP:0       (ocf::heartbeat:IPaddr2):       Started server2ubuntu1
>>>>>>>>      ClusterIP:1       (ocf::heartbeat:IPaddr2):       Started server4ubuntu1
>>>>>>>>  Clone Set: WebFS-clone [WebFS]
>>>>>>>>      Started: [ server4ubuntu1 ]
>>>>>>>>      Stopped: [ server2ubuntu1 ]
>>>>>>>>  Clone Set: WebSite-clone [WebSite]
>>>>>>>>      Started: [ server4ubuntu1 ]
>>>>>>>>      Stopped: [ server2ubuntu1 ]
>>>>>>>>
>>>>>>>> Failed Actions:
>>>>>>>> * stonith_id_1_start_0 on server2ubuntu1 'unknown error' (1): call=201, status=Error, exitreason='',
>>>>>>>>     last-rc-change='Thu Jul 2 14:37:35 2020', queued=0ms, exec=3403ms
>>>>>>>> * r0_pacemaker_monitor_60000 on server2ubuntu1 'master' (8): call=203, status=complete, exitreason='',
>>>>>>>>     last-rc-change='Thu Jul 2 14:38:39 2020', queued=0ms, exec=0ms
>>>>>>>> * stonith_id_1_start_0 on server4ubuntu1 'unknown error' (1): call=202, status=Error, exitreason='',
>>>>>>>>     last-rc-change='Thu Jul 2 14:37:39 2020', queued=0ms, exec=3411ms
>>>>>>>>
>>>>>>>>
>>>>>>>> The stonith resource is stopped and does not seem to work.
>>>>>>>> On both hosts the command
>>>>>>>> # fence_xvm -o list
>>>>>>>> kvm102                 bab3749c-15fc-40b7-8b6c-d4267b9f0eb9 on
>>>>>>> This should show both VMs, so getting to that point will likely solve your problem. fence_xvm relies on multicast, there could be some obscure network configuration needed to get that working on the VMs.
>>>>>>>
>>>>>>>> returns the local VM. Apparently it connects through the virtualization interface, because it returns the VM name, not the hostname of the client VM. I do not know if this is how it is supposed to work?
>>>>>>> Yes, fence_xvm knows only about the VM names.
>>>>>>>
>>>>>>> To get pacemaker to be able to use it for fencing the cluster nodes, you have to add a pcmk_host_map parameter to the fencing resource. It looks like pcmk_host_map="nodename1:vmname1;nodename2:vmname2;..."
>>>>>>>
>>>>>>>> In the local network, all traffic is allowed. No firewall is active locally; only connections leaving the local network are firewalled.
>>>>>>>> Hence there are no connection problems between the hosts and clients.
>>>>>>>> For example, we can successfully connect from the clients to the hosts:
>>>>>>>> # nc -z -v -u 192.168.1.21 1229
>>>>>>>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>>>>>>>> Ncat: Connected to 192.168.1.21:1229.
>>>>>>>> Ncat: UDP packet sent successfully
>>>>>>>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>>>>>>>
>>>>>>>> # nc -z -v -u 192.168.1.13 1229
>>>>>>>> Ncat: Version 7.50 ( https://nmap.org/ncat )
>>>>>>>> Ncat: Connected to 192.168.1.13:1229.
>>>>>>>> Ncat: UDP packet sent successfully
>>>>>>>> Ncat: 1 bytes sent, 0 bytes received in 2.03 seconds.
>>>>>>>>
>>>>>>>> On the Ubuntu VMs we created and configured the stonith resource according to the howto provided here:
>>>>>>>> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/pdf/Clusters_from_Scratch/Pacemaker-1.1-Clusters_from_Scratch-en-US.pdf
>>>>>>>>
>>>>>>>> The actual line we used:
>>>>>>>> # pcs -f stonith_cfg stonith create stonith_id_1 external/libvirt hostlist="Host4,host2" hypervisor_uri="qemu+ssh://192.168.1.21/system"
>>>>>>>>
>>>>>>>> But as you can see in the pcs status output, stonith is stopped and exits with an unknown error.
>>>>>>>>
>>>>>>>> Can somebody please advise on how to proceed, or what additional information is needed to solve this problem?
>>>>>>>> Any help would be greatly appreciated! Thank you in advance.
>>>>>>>>
>>>>>>>> Kind regards
>>>>>>>> Stefan Schmitz
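As a concrete illustration of Ken's pcmk_host_map advice above, the fencing resource would look something like this (a sketch only: it assumes the fence_xvm agent is available on the Ubuntu nodes, and the node-to-VM mapping shown is a guess that has to be checked against which VM actually carries which cluster node):

# pcs -f stonith_cfg stonith create fence_kvm fence_xvm \
      multicast_address="225.0.0.12" \
      key_file="/etc/cluster/fence_xvm.key" \
      pcmk_host_map="server2ubuntu1:kvm102;server4ubuntu1:kvm101"
# pcs cluster cib-push stonith_cfg

With the per-host "two multicast addresses" setup you would instead create one such resource per hypervisor, each with its own multicast_address and key_file and with only the node hosted there in its pcmk_host_map.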