RCA: This issue was reported in a multi-bridge OVS environment with
dynamic bridges. Each VM is attached to one bridge via a vhost-user
(vhu) port, so adding or deleting a VM adds or deletes a bridge. We
found that deleting a single bridge as part of VM deletion disrupted
OVS traffic as a whole and caused a temporary outage even for
unrelated VMs. The dropped packets are counted under the
datapath_drop_lock_error coverage counter.

When a bridge is deleted, upcalls are disabled for some time and all
datapath (dpctl) flows are flushed. Because datapath flows are not
maintained per bridge, any packet that arrives during this window
misses in the datapath and needs an upcall. Since upcalls are disabled
at that point, the packet is dropped.
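
The failure mode described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not OVS code: every name
in it (toy_datapath, toy_receive, toy_flush_begin, and so on) is
hypothetical, flow keys are plain integers, and a single counter stands
in for datapath_drop_lock_error. It shows one flow table shared by all
bridges, a flush that empties it while upcalls are disabled, and the
resulting drop of a packet that belongs to an unrelated bridge.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_FLOWS 8

/* Toy model: one flow table shared by every bridge on the datapath. */
struct toy_datapath {
    int flow_keys[TOY_MAX_FLOWS];    /* Installed flow keys (hypothetical). */
    size_t n_flows;                  /* Number of installed flows. */
    bool upcalls_enabled;            /* False while a flush is in progress. */
    unsigned long lock_error_drops;  /* Stand-in for the drop counter. */
};

/* Returns true if a flow matching 'key' is installed. */
static bool
toy_flow_lookup(const struct toy_datapath *dp, int key)
{
    for (size_t i = 0; i < dp->n_flows; i++) {
        if (dp->flow_keys[i] == key) {
            return true;
        }
    }
    return false;
}

/* Installs a flow for 'key' if there is room and it is not present. */
static void
toy_flow_install(struct toy_datapath *dp, int key)
{
    if (dp->n_flows < TOY_MAX_FLOWS && !toy_flow_lookup(dp, key)) {
        dp->flow_keys[dp->n_flows++] = key;
    }
}

/* Starts a flush: upcalls are disabled and every flow is removed,
 * regardless of which bridge it belongs to. */
static void
toy_flush_begin(struct toy_datapath *dp)
{
    dp->upcalls_enabled = false;
    dp->n_flows = 0;
}

/* Ends a flush: upcalls are enabled again. */
static void
toy_flush_end(struct toy_datapath *dp)
{
    dp->upcalls_enabled = true;
}

/* Processes one packet.  Returns true if it is forwarded, false if it is
 * dropped because it missed in the flow table while upcalls were off. */
static bool
toy_receive(struct toy_datapath *dp, int key)
{
    if (toy_flow_lookup(dp, key)) {
        return true;                 /* Flow table hit. */
    }
    if (!dp->upcalls_enabled) {
        dp->lock_error_drops++;      /* Miss with upcalls disabled: drop. */
        return false;
    }
    toy_flow_install(dp, key);       /* Upcall handled: flow installed. */
    return true;
}
```

In this model, a packet for an unrelated bridge that was forwarded
before toy_flush_begin() is dropped until toy_flush_end() runs, which
mirrors the temporary outage seen by other VMs.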

The fix is to bypass "ofproto_class->flush(ofproto)" for the userspace
datapath: once this callback takes the lock and flushes the datapath
flows, the result is a complete traffic outage. As per our analysis,
the userspace datapath does not need a call to udpif_flush() to flush
all datapath flows. Instead, the flows that belong to the affected
bridge are deleted automatically by the port deletion or port destroy
APIs, which are already called later in the code flow during the
bridge add/del sequence.

Signed-off-by: Vipul Ashri <[email protected]>
---
 ofproto/ofproto.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 3df64efb9..4ef0b8997 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -1685,8 +1685,9 @@ ofproto_flush__(struct ofproto *ofproto, bool del)
 {
     struct oftable *table;
 
-    /* This will flush all datapath flows. */
-    if (del && ofproto->ofproto_class->flush) {
+    /* This will flush all dp flows, only if datapath is not userspace. */
+    if (del && ofproto->ofproto_class->flush
+        && !strcmp(ofproto->type, "system")) {
         ofproto->ofproto_class->flush(ofproto);
     }
 
-- 
2.34.1.windows.1
