On 9 Oct 2023, at 17:06, David Marchand wrote:
> A main thread executing the 'revalidator/purge' command could race with > revalidator threads that can be dumping/sweeping the purged flows at the > same time. > > This race can be reproduced (with dpif debug logs) by running the > conntrack - ICMP related unit tests with the userspace datapath: > > 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request > revalidator/purge[], id=0 > 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev: > flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>, > packets:0, bytes:0, used:never > 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del > ufid:07046e91-30a6-4862-9048-1a76b5a88a5b > recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0), > ct_state(0),ct_zone(0),ct_mark(0),ct_label(0), > packet_type(ns=0,id=0), > eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75), > eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0, > ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0, > used:never > ... > 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev: > failed to flow_get (No such file or directory) > ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0, > bytes:0, used:never > 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN| > Failed to acquire udpif_key corresponding to unexpected flow > (No such file or directory): > ufid:07046e91-30a6-4862-9048-1a76b5a88a5b > ... > 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: "" > > To avoid this race, a first part of the fix is to pause (if not already > paused) the revalidators while the main thread is purging the datapath > flows. > > Then a second issue is observed by running the same unit test with the > kernel datapath. Its dpif implementation dumps flows via a netlink request > (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(), > nl_dump_start(), nl_sock_send__()) in the leader revalidator thread, > before pausing revalidators: > > 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request > revalidator/purge[], id=0 > ... > 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del > ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0), > skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0), > ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2, > dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1, > tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00), > packets:0, bytes:0, used:never > ... > 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: "" > ... > 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system: > flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, > packets:0, bytes:0, used:never > ... > 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system: > failed to flow_get (No such file or directory) > ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0, > bytes:0, used:never > 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN| > Failed to acquire udpif_key corresponding to unexpected flow > (No such file or directory): > ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 > > To avoid evaluating already deleted flows, the second part of the fix is > to ensure that dumping from the leader revalidator thread is done out of > any pause request. > > Fixes: 98bb4286970d ("tests: Add command to purge revalidators of flows.") > Signed-off-by: David Marchand <david.march...@redhat.com> > --- > Note: with this fix, I can make most of system-traffic tests run fine > under make check-dpdk check-kernel check-offloads. > "Most", because a unit test "datapath - basic truncate action" fails for > me though it predates this patch, and I did not understand why it fails > yet. I’m not getting this error in my test environment. I tested all datapaths and did not get any errors. > The check-offloads target can run fine if removing the exception on > "failed to flow_get" and "failed to acquire ukey" warning logs. Would it be good to add another patch in the series, to remove these? This patch looks good to me! Acked-by: Eelco Chaudron <echau...@redhat.com> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev