RCA: This issue was reported in a multi-bridge OVS environment with
dynamic bridges. Each VM is attached to one bridge via a vhu
(vhost-user) port, and adding or deleting a VM adds or deletes its
bridge. We found that deleting an individual bridge as part of VM
deletion impacted overall OVS traffic, causing a temporary outage
even for other, unrelated VMs. The packets are dropped and counted
under the datapath_drop_lock_error coverage counter. We found that
when deleting a bridge, we disable upcalls for some time and flush
all datapath (dpctl) flows. As datapath flows are not kept per
bridge, any packet arriving during this period misses the datapath
flow table and goes for an upcall, which is disabled at that point,
so the packet is dropped.

The fix is to bypass "ofproto->ofproto_class->flush(ofproto)". Once
this callback takes the lock and flushes the datapath flows, the
result is a complete traffic outage. As per my analysis, there is no
need to call udpif_flush() to flush all datapath flows. The flows of
the bridge being removed are deleted automatically by the port delete
or destroy APIs that are already called later in the code flow and in
the bridge add/delete paths.
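
For illustration only, the drop mechanism described above can be
modelled with a toy C sketch. Every name in it (toy_datapath,
toy_receive, toy_flush_all, toy_flush_bridge) is hypothetical and is
not OVS code. It only assumes what the RCA states: one flow table
shared by all bridges, and a miss that is dropped while upcalls are
disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and functions are hypothetical and are
 * not part of the OVS source tree. */
#define TOY_MAX_FLOWS 16

struct toy_flow {
    int bridge_id;              /* Bridge that owns the flow. */
    int in_port;                /* Port the flow matches on. */
    bool in_use;
};

struct toy_datapath {
    struct toy_flow flows[TOY_MAX_FLOWS]; /* Shared by all bridges. */
    bool upcalls_enabled;
    unsigned int dropped;       /* Misses dropped while upcalls are off. */
};

enum toy_verdict { TOY_FORWARD, TOY_UPCALL, TOY_DROP };

/* Installs a flow in the first free slot of the shared table. */
static void
toy_flow_add(struct toy_datapath *dp, int bridge_id, int in_port)
{
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (!dp->flows[i].in_use) {
            dp->flows[i] = (struct toy_flow) { bridge_id, in_port, true };
            return;
        }
    }
    assert(!"toy flow table is full");
}

/* A hit forwards.  A miss needs an upcall and is dropped if upcalls
 * are disabled. */
static enum toy_verdict
toy_receive(struct toy_datapath *dp, int in_port)
{
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (dp->flows[i].in_use && dp->flows[i].in_port == in_port) {
            return TOY_FORWARD;
        }
    }
    if (dp->upcalls_enabled) {
        return TOY_UPCALL;
    }
    dp->dropped++;
    return TOY_DROP;
}

/* Models the global flush: every bridge loses its flows. */
static void
toy_flush_all(struct toy_datapath *dp)
{
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        dp->flows[i].in_use = false;
    }
}

/* Models per-port cleanup: only the deleted bridge's flows go away. */
static void
toy_flush_bridge(struct toy_datapath *dp, int bridge_id)
{
    for (size_t i = 0; i < TOY_MAX_FLOWS; i++) {
        if (dp->flows[i].bridge_id == bridge_id) {
            dp->flows[i].in_use = false;
        }
    }
}
```

In this model, with upcalls disabled, toy_flush_bridge(&dp, 1) leaves
a packet on a port of bridge 2 forwarding, whereas toy_flush_all(&dp)
makes the same packet miss and get dropped.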

Signed-off-by: Vipul Ashri <[email protected]>
---
 ofproto/ofproto.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 3df64efb9..0f0e20269 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1839,7 +1839,12 @@ ofproto_destroy(struct ofproto *p, bool del)
         return;
     }
 
-    ofproto_flush__(p, del);
+    /* Pass "false" instead of "del" to avoid a traffic outage during
+     * individual bridge deletion.
+     * Datapath flows are cleaned up automatically when each port is
+     * deleted/destructed and its flows are evicted.
+     */
+    ofproto_flush__(p, false);
     HMAP_FOR_EACH_SAFE (ofport, hmap_node, &p->ports) {
         ofport_destroy(ofport, del);
     }
-- 
2.34.1.windows.1
