Hi David,

On Wed, Oct 18, 2023 at 01:49:08PM +0200, David Marchand wrote:
> Hello Simon,
> 
> On Fri, Oct 13, 2023 at 2:53 PM David Marchand
> <david.march...@redhat.com> wrote:
> > On Tue, Oct 10, 2023 at 5:10 PM Simon Horman <ho...@ovn.org> wrote:
> > >
> > > On Mon, Oct 09, 2023 at 05:06:51PM +0200, David Marchand wrote:
> > > > A main thread executing the 'revalidator/purge' command could race with
> > > > revalidator threads that can be dumping/sweeping the purged flows at the
> > > > same time.
> > > >
> > > > This race can be reproduced (with dpif debug logs) by running the
> > > > conntrack - ICMP related unit tests with the userspace datapath:
> > > >
> > > > 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request
> > > >       revalidator/purge[], id=0
> > > > 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev:
> > > >       flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>,
> > > >       packets:0, bytes:0, used:never
> > > > 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del
> > > >       ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
> > > >       recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),
> > > >       ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),
> > > >       packet_type(ns=0,id=0),
> > > >       eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75),
> > > >       eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,
> > > >       ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0,
> > > >       used:never
> > > > ...
> > > > 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev:
> > > >       failed to flow_get (No such file or directory)
> > > >       ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0,
> > > >       bytes:0, used:never
> > > > 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN|
> > > >       Failed to acquire udpif_key corresponding to unexpected flow
> > > >       (No such file or directory):
> > > >       ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
> > > > ...
> > > > 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: 
> > > > ""
> > > >
> > > > To avoid this race, a first part of the fix is to pause (if not already
> > > > paused) the revalidators while the main thread is purging the datapath
> > > > flows.
> > > >
> > > > Then a second issue is observed by running the same unit test with the
> > > > kernel datapath. Its dpif implementation dumps flows via a netlink 
> > > > request
> > > > (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(),
> > > > nl_dump_start(), nl_sock_send__()) in the leader revalidator thread,
> > > > before pausing revalidators:
> > > >
> > > > 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request
> > > >       revalidator/purge[], id=0
> > > > ...
> > > > 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del
> > > >       ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0),
> > > >       skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),
> > > >       ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2,
> > > >       dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,
> > > >       tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00),
> > > >       packets:0, bytes:0, used:never
> > > > ...
> > > > 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: 
> > > > ""
> > > > ...
> > > > 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system:
> > > >       flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>,
> > > >       packets:0, bytes:0, used:never
> > > > ...
> > > > 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system:
> > > >       failed to flow_get (No such file or directory)
> > > >       ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0,
> > > >       bytes:0, used:never
> > > > 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN|
> > > >       Failed to acquire udpif_key corresponding to unexpected flow
> > > >       (No such file or directory):
> > > >       ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9
> > > >
> > > > To avoid evaluating already deleted flows, the second part of the fix is
> > > > to ensure that dumping from the leader revalidator thread is done out of
> > > > any pause request.
> > > >
> > > > Fixes: 98bb4286970d ("tests: Add command to purge revalidators of 
> > > > flows.")
> > > > Signed-off-by: David Marchand <david.march...@redhat.com>
> > > > ---
> > >
> > > Hi David,
> > >
> > > I'm not having much luck reproducing the above.
> > > Any guidance would be much appreciated.
> >
> > Sorry, my fault.
> >
> > I wrote and tested this series against my system-dpdk series (patch 7
> > pulls the system-traffic.at tests in check-dpdk).
> > https://patchwork.ozlabs.org/project/openvswitch/list/?series=371864&state=%2A&archive=both
> >
> >
> > I can't reproduce the issue with check-system-userspace on the current 
> > master.
> > But I do reproduce a failure on this test with check-kernel (rarely)
> > on the current master, on my fedora 37 system.
> > If I enable dpif:dbg logs in the test, it is more frequent:
> >
> > $ git diff
> > diff --git a/tests/system-traffic.at b/tests/system-traffic.at
> > index 418cd32fec..0844ada0fd 100644
> > --- a/tests/system-traffic.at
> > +++ b/tests/system-traffic.at
> > @@ -3654,6 +3654,7 @@ AT_SETUP([conntrack - ICMP related])
> >  AT_SKIP_IF([test $HAVE_NC = no])
> >  CHECK_CONNTRACK()
> >  OVS_TRAFFIC_VSWITCHD_START()
> > +AT_CHECK([ovs-appctl vlog/set dpif:dbg])
> >
> >  ADD_NAMESPACES(at_ns0, at_ns1)
> >
> >
> > # while true; do make -C build check-kernel TESTSUITEFLAGS="-d 78" ||
> > break; done
> > ...
> >  78: conntrack - ICMP related                        FAILED
> > (system-traffic.at:3687)
> >
> > Then, looking for the flow that triggers a warning log message:
> >
> > # grep -E 'unixctl|544ea5f9-7b93-48e4-a5da-ec76be873602'
> > build/tests/system-kmod-testsuite.dir/078/ovs-vswitchd.log
> > 2023-10-13T12:26:01.151Z|00037|unixctl|DBG|received request
> > vlog/set["dpif:dbg"], id=0
> > 2023-10-13T12:26:01.151Z|00038|unixctl|DBG|replying with success, id=0: ""
> > 2023-10-13T12:26:01.269Z|00002|dpif(handler2)|DBG|system@ovs-system:
> > put[create] ufid:544ea5f9-7b93-48e4-a5da-ec76be873602
> > recirc_id(0),dp_hash(0/0),skb_priority(0/0),in_port(2),skb_mark(0/0),ct_state(0/0x24),ct_zone(0/0),ct_mark(0),ct_label(0/0),eth(src=a6:0a:bf:e2:f3:f2,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2/00:00:00:00:00:00,tha=00:00:00:00:00:00/00:00:00:00:00:00),
> > actions:1,3
> > 2023-10-13T12:26:01.274Z|00122|unixctl|DBG|received request
> > revalidator/purge[], id=0
> > 2023-10-13T12:26:01.274Z|00025|dpif(revalidator18)|DBG|system@ovs-system:
> > flow_dump ufid:544ea5f9-7b93-48e4-a5da-ec76be873602 <empty>,
> > packets:0, bytes:0, used:never
> > 2023-10-13T12:26:01.274Z|00123|dpif|DBG|system@ovs-system: flow_del
> > ufid:544ea5f9-7b93-48e4-a5da-ec76be873602
> > recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00),
> > packets:0, bytes:0, used:never
> > 2023-10-13T12:26:01.274Z|00029|dpif(revalidator18)|WARN|system@ovs-system:
> > failed to flow_get (No such file or directory)
> > ufid:544ea5f9-7b93-48e4-a5da-ec76be873602 <empty>, packets:0, bytes:0,
> > used:never
> > 2023-10-13T12:26:01.274Z|00030|ofproto_dpif_upcall(revalidator18)|WARN|Failed
> > to acquire udpif_key corresponding to unexpected flow (No such file or
> > directory): ufid:544ea5f9-7b93-48e4-a5da-ec76be873602
> > 2023-10-13T12:26:01.275Z|00128|unixctl|DBG|replying with success, id=0: ""
> > 2023-10-13T12:26:01.331Z|00157|unixctl|DBG|received request
> > exit["--cleanup"], id=0
> > 2023-10-13T12:26:01.391Z|00165|unixctl|DBG|replying with success, id=0: ""
> 
> Did you manage to reproduce the issue?
> Do you want me to change something in the patch?

Sorry, no, I didn't have a chance to reproduce the problem.
But also, no, I don't have any changes to request.
_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to