Hi all,

We are currently running Proxmox, backed by Open vSwitch. Until recently we had not noticed any issues with this setup. We have since upgraded our data centre switching (Juniper QFX), which enables an ARP Suppression feature by default. This now appears to be suppressing some ARP traffic, and we are intermittently losing access to our Proxmox hosts.

The setup on one of them is as follows (all three of our hosts are experiencing the same issue, though):

# Main interface
allow-vmbr0 bond0
iface bond0 inet manual
    ovs_bonds eno3 eno4
    ovs_type OVSBond
    ovs_bridge vmbr0
    ovs_options lacp=active bond_mode=balance-tcp

auto lo
iface lo inet loopback

# Interface to secondary network
allow-vmbr1 eno1
iface eno1 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr1

# Mirror to port capture server
allow-vmbr0 eno2
iface eno2 inet manual
    ovs_type OVSPort
    ovs_bridge vmbr0

iface eno3 inet manual

iface eno4 inet manual

# Management interface
allow-vmbr0 vport0
iface vport0 inet static
    address 10.21.0.15
    netmask 255.255.255.0
    gateway 10.21.0.210
    ovs_type OVSIntPort
    ovs_bridge vmbr0

# Secondary network
allow-vmbr1 vport1
iface vport1 inet static
    address 172.22.1.15
    netmask 255.255.255.0
    ovs_type OVSIntPort
    ovs_bridge vmbr1
    ovs_options tag=100

auto vmbr0
iface vmbr0 inet manual
    ovs_type OVSBridge
    ovs_ports bond0 vport0 eno2

auto vmbr1
iface vmbr1 inet manual
    ovs_type OVSBridge
    ovs_ports eno1 vport1

We appear to be hitting some strange behaviour where two interfaces on the hosts respond to ARP, each with a different MAC, and interestingly only if the source address of the ARP packet is 0.0.0.0.

ip a | grep -EiA2 "vmbr0|vport0"
vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 18:66:da:51:b3:eb brd ff:ff:ff:ff:ff:ff
vport0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether aa:83:29:09:fa:bc brd ff:ff:ff:ff:ff:ff
    inet 10.21.0.15/24 brd 10.21.0.255 scope global vport0

In the packet captures, we see ARP replies with source MAC addresses of both 18:66:da:51:b3:eb and aa:83:29:09:fa:bc. As noted, before this ARP suppression feature was enabled, both ARP replies would be seen by whatever sent the request, and so this never caused an issue.

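To make the trigger concrete: an ARP request whose sender IP is 0.0.0.0 is what is usually called an ARP probe. The following is a minimal Python sketch, not the tooling we actually used, that builds such a frame and tallies which MACs answer for an IP in a list of captured frames. The function names (build_arp_probe, parse_arp, replying_macs) are made up for illustration, and nothing is sent on the wire; feeding it real frames from a capture is left as an assumption.

```python
import socket
import struct

BROADCAST = b"\xff" * 6
ETHERTYPE_ARP = 0x0806


def mac_to_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def bytes_to_mac(raw: bytes) -> str:
    return ":".join(f"{b:02x}" for b in raw)


def build_arp_probe(sender_mac: str, target_ip: str) -> bytes:
    """Build an Ethernet frame carrying an ARP request with sender IP 0.0.0.0."""
    sha = mac_to_bytes(sender_mac)
    eth = BROADCAST + sha + struct.pack("!H", ETHERTYPE_ARP)
    # htype=1 (Ethernet), ptype=0x0800 (IPv4), hlen=6, plen=4, op=1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)
    arp += sha + socket.inet_aton("0.0.0.0")          # sender MAC, sender IP
    arp += b"\x00" * 6 + socket.inet_aton(target_ip)  # target MAC, target IP
    return eth + arp


def parse_arp(frame: bytes):
    """Return (op, sender_mac, sender_ip, target_ip), or None if not ARP."""
    if len(frame) < 42 or frame[12:14] != struct.pack("!H", ETHERTYPE_ARP):
        return None
    op = struct.unpack("!H", frame[20:22])[0]
    return (
        op,
        bytes_to_mac(frame[22:28]),
        socket.inet_ntoa(frame[28:32]),
        socket.inet_ntoa(frame[38:42]),
    )


def replying_macs(frames, ip: str) -> set:
    """Collect the distinct sender MACs of ARP replies (op=2) claiming `ip`."""
    macs = set()
    for frame in frames:
        parsed = parse_arp(frame)
        if parsed and parsed[0] == 2 and parsed[2] == ip:
            macs.add(parsed[1])
    return macs
```

If replying_macs() returns more than one MAC for 10.21.0.15, that would match what we see in the captures: two different interfaces answering the same probe.
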
Now ARP tables in our network are getting updated with just the 18:66:da:51:b3:eb address, which blackholes traffic. Later we see the ARP entries in our network updated again to the aa:83:29:09:fa:bc MAC address (probably due to genuine ARP requests), at which point the hosts are reachable again.

We have disabled ARP Suppression for now, but the option to disable this feature will be removed in the next JunOS major version, so we need to work out what is causing both interfaces to generate the replies.

We can recreate the issue using arping, by turning ARP suppression back on and sending ARP packets to the IP with a source IP of 0.0.0.0. Using 0.0.0.0 as a source IP appears to be valid usage of ARP, and is used for duplicate address detection. Unfortunately this very detection is causing duplicate ARP responses, usefully enough!

$ sudo ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.7.0
DB Schema 7.14.0

I'm more than happy to provide more diagnostics and more information. I have tried some of the protocol tracing, and it doesn't appear to give much insight into why it's happening, or even to show that OVS believes it is happening:

ovs-appctl ofproto/trace vmbr0 in_port=1,arp,dl_src=88:a2:5e:e6:47:a0,dl_dst=ff:ff:ff:ff:ff:ff,arp_tpa=10.21.0.15,arp_spa=0.0.0.0,arp_op=1,arp_sha=88:a2:5e:e6:47:a0

Flow: arp,in_port=1,vlan_tci=0x0000,dl_src=88:a2:5e:e6:47:a0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=0.0.0.0,arp_tpa=10.21.0.15,arp_op=1,arp_sha=88:a2:5e:e6:47:a0,arp_tha=00:00:00:00:00:00

bridge("vmbr0")
---------------
 0. priority 0
    NORMAL
     -> no learned MAC for destination, flooding

Final flow: unchanged
Megaflow: recirc_id=0,arp,in_port=1,vlan_tci=0x0000/0x1fff,dl_src=88:a2:5e:e6:47:a0,dl_dst=ff:ff:ff:ff:ff:ff,arp_spa=0.0.0.0,arp_tpa=10.21.0.15,arp_op=1
Datapath actions: 1,5,30,34,40,56,65,74,79

Thanks in advance

Stuart Howlette