** Description changed:

  Seems like collateral from
  https://bugs.launchpad.net/neutron/+bug/1751396
  
  In DVR, the distributed gateway port's IP and MAC are shared in the
  qrouter across all hosts.
  
  The dvr_process_flow on the physical bridge (which replaces the shared
  router_distributed MAC address with the unique per-host MAC when it is
  the source) is missing, and so is the drop rule which instructs the
  bridge to drop all traffic destined for the shared distributed MAC.
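
  As a rough illustration of what those two rules are meant to achieve,
  here is a minimal Python model of the physical bridge. It is a sketch
  only, not the agent's code: the Frame type, the physical_bridge()
  function and HOST_MAC are hypothetical names invented for this example,
  ROUTER_MAC is assumed to be the shared router MAC seen in the fdb output
  further down, and real flows also match on port and VLAN, which is
  ignored here.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Assumption: the MAC grepped for in the fdb output is the shared router MAC.
ROUTER_MAC = "fa:16:3e:42:a2:ec"
# Hypothetical per-host DVR MAC, made up for this example.
HOST_MAC = "fa:16:3f:00:00:01"


@dataclass(frozen=True)
class Frame:
    src: str
    dst: str


def physical_bridge(frame: Frame, flows_installed: bool = True) -> Optional[Frame]:
    """Model the two DVR rules described above; return None for a dropped frame."""
    if not flows_installed:
        # The failure case in this bug: frames pass through untouched,
        # so the shared router MAC leaks onto the physical network.
        return frame
    if frame.dst == ROUTER_MAC:
        # Drop rule: nothing addressed to the shared MAC crosses the bridge.
        return None
    if frame.src == ROUTER_MAC:
        # Process flow: hide the shared MAC behind the per-host MAC.
        return replace(frame, src=HOST_MAC)
    return frame
```

  With flows_installed=False the frame leaves with the shared MAC still
  as its source, which is the condition that lets other hosts learn that
  MAC on their uplink port.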
  
  Because of this, we are seeing the router MAC on the network
  infrastructure, causing it to flap on br-int on every compute host:
  
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    1
+    11     4  fa:16:3e:42:a2:ec    1
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    2
+    11     4  fa:16:3e:42:a2:ec    2
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-     1     4  fa:16:3e:42:a2:ec    0
+     1     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    0
+    11     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    0
+    11     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-     1     4  fa:16:3e:42:a2:ec    0
+     1     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-     1     4  fa:16:3e:42:a2:ec    0
+     1     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-     1     4  fa:16:3e:42:a2:ec    0
+     1     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-     1     4  fa:16:3e:42:a2:ec    1
+     1     4  fa:16:3e:42:a2:ec    1
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    0
+    11     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    0
+    11     4  fa:16:3e:42:a2:ec    0
  root@milhouse:~# ovs-appctl fdb/show br-int | grep fa:16:3e:42:a2:ec
-    11     4  fa:16:3e:42:a2:ec    0
+    11     4  fa:16:3e:42:a2:ec    0
  
+ Where port 1 is phy-br-vlan, connecting to the physical bridge, and port
+ 11 is the correct local qr-interface. Because these dvr flows are
+ missing on br-vlan, pkts w/ source mac ingress into the host and br-int
+ learns it upstream.
  
- Where port 1 is phy-br-vlan, connecting to the physical bridge, and port 11 is the correct local qr-interface. Because these dvr flows are missing on br-vlan, pkts w/ source mac ingress into the host and br-int learns it upstream.
- 
- 
- The symptom is when pinging a VM's floating IP, we see occasional packet loss (10-30%), and sometimes the responses are sent upstream by br-int instead of the qrouter, so the ICMP replies come with fixed IP of the replier since no NAT'ing took place, and on the tenant network rather than external network.
+ The symptom is when pinging a VM's floating IP, we see occasional packet
+ loss (10-30%), and sometimes the responses are sent upstream by br-int
+ instead of the qrouter, so the ICMP replies come with fixed IP of the
+ replier since no NAT'ing took place, and on the tenant network rather
+ than external network.
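
  The flapping in the fdb/show samples above can also be counted
  mechanically. The following is a small sketch, assuming the four-column
  layout (port, VLAN, MAC, age) shown in that output; learned_port() and
  count_flaps() are hypothetical helper names, and the sample lines are
  copied from the output above.

```python
def learned_port(fdb_output, mac):
    """Return the port column for `mac` in fdb/show output, or None if absent."""
    for line in fdb_output.splitlines():
        fields = line.split()
        # Assumed columns: port, VLAN, MAC, age (a header line never matches).
        if len(fields) == 4 and fields[2].lower() == mac.lower():
            return fields[0]
    return None


def count_flaps(ports):
    """Count how often the learned port changes between consecutive samples."""
    seen = [p for p in ports if p is not None]
    return sum(1 for a, b in zip(seen, seen[1:]) if a != b)


MAC = "fa:16:3e:42:a2:ec"
samples = [
    "   11     4  fa:16:3e:42:a2:ec    1",
    "   11     4  fa:16:3e:42:a2:ec    2",
    "    1     4  fa:16:3e:42:a2:ec    0",
    "   11     4  fa:16:3e:42:a2:ec    0",
    "   11     4  fa:16:3e:42:a2:ec    0",
    "    1     4  fa:16:3e:42:a2:ec    0",
    "    1     4  fa:16:3e:42:a2:ec    0",
    "    1     4  fa:16:3e:42:a2:ec    0",
    "    1     4  fa:16:3e:42:a2:ec    1",
    "   11     4  fa:16:3e:42:a2:ec    0",
    "   11     4  fa:16:3e:42:a2:ec    0",
    "   11     4  fa:16:3e:42:a2:ec    0",
]
ports = [learned_port(s, MAC) for s in samples]
flaps = count_flaps(ports)
print(f"{MAC} moved between ports {flaps} times in {len(ports)} samples")
```

  On a healthy host the MAC would be expected to stay on the local
  qr-interface port (11 here), giving zero flaps.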
  
  When I force net_shared_only to False here, the problem goes away:
  
https://github.com/openstack/neutron/blob/stable/pike/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_dvr_neutron_agent.py#L436
  
  It should be noted we *ONLY* need to do this on our dvr_snat host. The
  dvr process flows are missing on every compute host. But if we shut
  down the qrouter on the snat host, FIP functionality works and the DVR
  MAC stops flapping on the others. Or if we apply the fix only to the
  snat host, it works. Perhaps there is something on the SNAT node that
  is unique.
+ 
+ 
+ Ubuntu SRU details:
+ -------------------
+ [Impact]
+ See above
+ 
+ [Test Case]
+ Deploy OpenStack with dvr enabled and then follow the steps above.
+ 
+ [Regression Potential]
+ The patches that are backported have already landed upstream in the corresponding stable branches, helping to minimize any regression potential.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1783654

Title:
  DVR process flow not installed on physical bridge for shared tenant
  network
