Hi Kris, did you ever fix this issue? We're also seeing similar problems running balance-tcp bonds on OVS 2.5.0.
Thanks,
Ray

On Mon, Feb 27, 2017 at 4:37 AM O'Reilly, Darragh <[email protected]> wrote:

> Hi,
>
> I'm also running Neutron provider networks on OVS 2.5 (DPDK) with LACP
> (balance-tcp), but I do not see this problem.
>
> OVS should not output a packet to the bundle it came in on:
> https://github.com/openvswitch/ovs/blob/branch-2.5/ofproto/ofproto-dpif-xlate.c#L2321
>
> I have no idea why it could be happening, but it does remind me of a
> problem with buggy NIC firmware in a blade system that was reflecting
> some outbound packets back in and confusing the OVS learning tables.
>
> Try looking at: watch -n1 "ovs-appctl fdb/show br-ext"
>
> Do the OVS logs have anything? You could try a Linux bond and see if it
> makes a difference.
>
> Darragh.
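For anyone else chasing this, a quick way to act on the fdb/show suggestion is to watch the learning table on the affected bridge; the bridge name and the flapping MAC are taken from the thread below, and the grep filter is only an illustration:

  # refresh the OVS MAC learning table for br-ext once a second
  watch -n1 "ovs-appctl fdb/show br-ext"

  # or track just the MAC the switch reports as flapping
  watch -n1 "ovs-appctl fdb/show br-ext | grep -i fa:16:3e:ad:e6:cf"

If the entry for that MAC keeps jumping between the bond and a local port, the reflected frames are reaching OVS's learning table, which would line up with the MAC-move notifications the switch logs below.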
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Kris G. Lindgren
> Sent: 25 February 2017 17:36
> To: [email protected]
> Subject: [ovs-discuss] OVS 2.5.1 in an LACP bond does not correctly handle unicast flooding
>
> We recently upgraded from OVS 2.3.3 to OVS 2.5.1. After upgrading we started
> seeing MACs for VMs and HVs learned on ports that they were not connected to.
> After a long investigation we were able to see that OVS does not correctly
> handle unicast flooding: we would see OVS flood traffic that was not destined
> to a local MAC back out one of the bond members. In the switch we see:
>
> 2017 Feb 23 12:11:20 lfassi0114-02 %FWM-6-MAC_MOVE_NOTIFICATION: Host fa16.3ead.e6cf in vlan 413 is flapping between port Po19 and port Po22
> 2017 Feb 23 12:11:21 lfassi0114-02 %FWM-6-MAC_MOVE_NOTIFICATION: Host fa16.3ead.e6cf in vlan 413 is flapping between port Po22 and port Po19
>
> On the host connected to Po22 (which is not where fa16.3ead.e6cf lives) we see:
>
> 12:11:20.374794 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
> 12:11:20.374941 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
> 12:11:20.376145 00:00:0c:9f:f0:01 > fa:16:3e:ad:e6:cf, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Reply 10.198.39.254 is-at 00:00:0c:9f:f0:01, length 46
> 12:11:21.374628 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
> 12:11:21.375057 00:00:0c:9f:f0:01 > fa:16:3e:ad:e6:cf, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Reply 10.198.39.254 is-at 00:00:0c:9f:f0:01, length 46
> 12:11:22.374578 fa:16:3e:ad:e6:cf > 00:00:0c:9f:f0:01, ethertype 802.1Q (0x8100), length 64: vlan 413, p 0, ethertype ARP, Request who-has 10.198.39.254 tell 10.198.38.178, length 46
>
> By using a span port in the network, spanning only traffic sent from the
> server, we were also able to see that traffic destined to 00:00:0c:9f:f0:01
> was sent back out. In this case 00:00:0c:9f:f0:01 is the virtual MAC of the
> HSRP gateway. On Cisco Nexus 3k (and I assume other Nexus products as well),
> when configured with vPC/LACP/HSRP, any traffic destined to the virtual MAC
> of the HSRP gateway that ends up on the non-active HSRP side will get flooded
> to all ports on the non-active side. This is done so that the ARP packet is
> seen by the active side. This is how this config from Cisco has worked since
> day one. We have also seen this happen in bursts where the switch will see
> 26k+ MAC moves in a minute, go into defense mode, and stop MAC learning. We
> haven't been able to specifically catch a large storm event, but given the
> way OVS is handling unicast flooding of ARP packets, we have no reason to
> believe it won't treat unicast flooding of other traffic the exact same way.
>
> Under OVS 2.3.3 the unicast flooding was handled correctly: the traffic was
> dropped and packets were not flooded back out the bond member. Changing the
> bonding mode from balance-slb to active-backup or balance-tcp makes no
> difference; the unicast traffic is still flooded back out the bond.
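For reference, the bonding-mode changes described above can be made on the running bond with something like the following; these exact commands are an assumption (port name taken from the config below), not a transcript of what was actually run:

  # try active-backup instead of balance-slb
  ovs-vsctl set port bond0 bond_mode=active-backup

  # or balance-tcp (LACP is already negotiated on this bond)
  ovs-vsctl set port bond0 bond_mode=balance-tcp

  # confirm the mode and the active member afterwards
  ovs-appctl bond/show bond0

In all of these modes the reported behaviour is the same: the reflected unicast still goes back out the bond.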
> Our OVS config is as follows:
>
> ovs-vsctl:
> ac83a7ff-0157-437c-bfba-8c038ec77c74
>     Bridge br-ext
>         Port br-ext
>             Interface br-ext
>                 type: internal
>         Port "bond0"
>             Interface "p3p1"
>             Interface "p3p2"
>         Port "mgmt0"
>             Interface "mgmt0"
>                 type: internal
>         Port "ext-vlan-215"
>             tag: 215
>             Interface "ext-vlan-215"
>                 type: patch
>                 options: {peer="br215-ext"}
>     Bridge br-int
>         fail_mode: secure
>         Port "int-br215"
>             Interface "int-br215"
>                 type: patch
>                 options: {peer="phy-br215"}
>         Port "qvo99ae272d-f8"
>             tag: 1
>             Interface "qvo99ae272d-f8"
>         Port "qvo1d5492c0-df"
>             tag: 1
>             Interface "qvo1d5492c0-df"
>         Port br-int
>             Interface br-int
>                 type: internal
>         Port "qvo6b7f3219-90"
>             tag: 1
>             Interface "qvo6b7f3219-90"
>         Port "qvo3b4f81ed-f4"
>             tag: 1
>             Interface "qvo3b4f81ed-f4"
>     Bridge "br215"
>         Port "br215"
>             Interface "br215"
>                 type: internal
>         Port "phy-br215"
>             Interface "phy-br215"
>                 type: patch
>                 options: {peer="int-br215"}
>         Port "br215-ext"
>             Interface "br215-ext"
>                 type: patch
>                 options: {peer="ext-vlan-215"}
>     ovs_version: "2.5.1"
>
> # ovs-appctl bond/show
> ---- bond0 ----
> bond_mode: balance-slb
> bond may use recirculation: no, Recirc-ID : -1
> bond-hash-basis: 0
> updelay: 0 ms
> downdelay: 0 ms
> next rebalance: 2426 ms
> lacp_status: negotiated
> active slave mac: 00:8c:fa:eb:2b:74(p3p1)
>
> slave p3p1: enabled
>     active slave
>     may_enable: true
>     hash 140: 154 kB load
>
> slave p3p2: enabled
>     may_enable: true
>     hash 199: 69 kB load
>     hash 220: 40 kB load
>     hash 234: 21 kB load
>
> # ovs-appctl lacp/show
> ---- bond0 ----
>     status: active negotiated
>     sys_id: 00:8c:fa:eb:2b:74
>     sys_priority: 65534
>     aggregation key: 9
>     lacp_time: slow
>
> slave: p3p1: current attached
>     port_id: 9
>     port_priority: 65535
>     may_enable: true
>
>     actor sys_id: 00:8c:fa:eb:2b:74
>     actor sys_priority: 65534
>     actor port_id: 9
>     actor port_priority: 65535
>     actor key: 9
>     actor state: activity aggregation synchronized collecting distributing
>
>     partner sys_id: 02:1c:73:87:60:cd
>     partner sys_priority: 32768
>     partner port_id: 52
>     partner port_priority: 32768
>     partner key: 52
>     partner state: activity aggregation synchronized collecting distributing
>
> slave: p3p2: current attached
>     port_id: 10
>     port_priority: 65535
>     may_enable: true
>
>     actor sys_id: 00:8c:fa:eb:2b:74
>     actor sys_priority: 65534
>     actor port_id: 10
>     actor port_priority: 65535
>     actor key: 9
>     actor state: activity aggregation synchronized collecting distributing
>
>     partner sys_id: 02:1c:73:87:60:cd
>     partner sys_priority: 32768
>     partner port_id: 32820
>     partner port_priority: 32768
>     partner key: 52
>     partner state: activity aggregation synchronized collecting distributing
>
> # ovs-ofctl dump-flows br-ext
> NXST_FLOW reply (xid=0x4):
>  cookie=0x0, duration=713896.614s, table=0, n_packets=1369078301, n_bytes=130805436786, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
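Since br-ext carries only that single priority=0 NORMAL flow, forwarding on the external bridge is decided entirely by MAC learning. One way to see what the NORMAL pipeline would do with one of the reflected ARP replies is ofproto/trace; the port number and flow fields in this sketch are assumptions pieced together from the captures earlier in the thread, not values confirmed on the affected host:

  # list the OpenFlow port numbers on br-ext to find the bond members
  ovs-ofctl show br-ext

  # trace the ARP reply from the HSRP virtual MAC as if it arrived on a bond member
  # (in_port=1 is an assumption -- substitute the real ofport of p3p1 or p3p2 from the output above)
  ovs-appctl ofproto/trace br-ext "arp,in_port=1,dl_vlan=413,dl_src=00:00:0c:9f:f0:01,dl_dst=fa:16:3e:ad:e6:cf,arp_op=2"

The "Datapath actions:" line at the end of the trace lists the ports the packet would be output to, which should make it obvious whether a bond member is among them.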
> # ovs-ofctl dump-flows br-int
> NXST_FLOW reply (xid=0x4):
>  cookie=0xb367eed8ac0e9e7d, duration=713933.475s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=2,icmp_type=136 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713932.943s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=3,icmp_type=136 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713929.414s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=5,icmp_type=136 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713928.888s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,icmp6,in_port=4,icmp_type=136 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713933.280s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=10,arp,in_port=2 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713932.660s, table=0, n_packets=149398, n_bytes=6274716, idle_age=4, hard_age=65534, priority=10,arp,in_port=3 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713929.218s, table=0, n_packets=102577, n_bytes=4308234, idle_age=7, hard_age=65534, priority=10,arp,in_port=5 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713928.620s, table=0, n_packets=61321, n_bytes=2575482, idle_age=8, hard_age=65534, priority=10,arp,in_port=4 actions=resubmit(,24)
>  cookie=0xb367eed8ac0e9e7d, duration=713935.656s, table=0, n_packets=1274428312, n_bytes=105873932966, idle_age=0, hard_age=65534, priority=3,in_port=1,vlan_tci=0x0000 actions=mod_vlan_vid:1,NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713945.070s, table=0, n_packets=7817, n_bytes=707680, idle_age=65534, hard_age=65534, priority=2,in_port=1 actions=drop
>  cookie=0xb367eed8ac0e9e7d, duration=713945.999s, table=0, n_packets=82510417, n_bytes=17955154731, idle_age=0, hard_age=65534, priority=0 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713945.936s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
>  cookie=0xb367eed8ac0e9e7d, duration=713933.544s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=2,icmp_type=136,nd_target=fe80::f816:3eff:fe49:4dff actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713933.009s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=3,icmp_type=136,nd_target=fe80::f816:3eff:fec7:82b9 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713929.482s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=5,icmp_type=136,nd_target=fe80::f816:3eff:fe07:d92e actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713928.951s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,icmp6,in_port=4,icmp_type=136,nd_target=fe80::f816:3eff:fe17:9919 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713933.410s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=2,arp_spa=10.26.87.153 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713933.344s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=2,arp_spa=10.26.52.87 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713932.877s, table=24, n_packets=149394, n_bytes=6274548, idle_age=4, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.53.163 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713932.807s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.85.208 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713932.728s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=3,arp_spa=10.26.85.209 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713929.349s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=5,arp_spa=10.26.85.218 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713929.284s, table=24, n_packets=102573, n_bytes=4308066, idle_age=7, hard_age=65534, priority=2,arp,in_port=5,arp_spa=10.26.53.86 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713928.817s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=4,arp_spa=10.26.87.99 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713928.752s, table=24, n_packets=61317, n_bytes=2575314, idle_age=8, hard_age=65534, priority=2,arp,in_port=4,arp_spa=10.26.53.197 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713928.686s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=2,arp,in_port=4,arp_spa=198.71.248.104 actions=NORMAL
>  cookie=0xb367eed8ac0e9e7d, duration=713945.871s, table=24, n_packets=16, n_bytes=672, idle_age=65534, hard_age=65534, priority=0 actions=drop
>
> ___________________________________________________________________
> Kris Lindgren
> Senior Linux Systems Engineer
> GoDaddy
>
> _______________________________________________
> discuss mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
_______________________________________________
discuss mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-discuss
