On 10/20/23 14:41, Simon Horman wrote:
> On Thu, Oct 19, 2023 at 03:27:29PM +0200, Eelco Chaudron wrote:
>>
>>
>> On 18 Oct 2023, at 16:23, David Marchand wrote:
>>
>>> This issue has been observed when running traffic tests with a dpdk
>>> enabled userspace datapath (though those tests are added in a separate
>>> series).
>>> However, the described issue also affects the kernel datapath which is
>>> why this patch is sent separately.
>>>
>>> A main thread executing the 'revalidator/purge' command could race with
>>> revalidator threads that can be dumping/sweeping the purged flows at the
>>> same time.
>>>
>>> This race can be reproduced (with dpif debug logs) by running the
>>> conntrack - ICMP related unit tests with the userspace datapath:
>>>
>>> 2023-10-09T14:11:55.242Z|00177|unixctl|DBG|received request
>>>     revalidator/purge[], id=0
>>> 2023-10-09T14:11:55.242Z|00044|dpif(revalidator17)|DBG|netdev@ovs-netdev:
>>>     flow_dump ufid:68ff6817-fb3b-4b30-8412-9cf175318294 <empty>,
>>>     packets:0, bytes:0, used:never
>>> 2023-10-09T14:11:55.242Z|00178|dpif|DBG|netdev@ovs-netdev: flow_del
>>>     ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
>>>     recirc_id(0),dp_hash(0),skb_priority(0),in_port(2),skb_mark(0),
>>>     ct_state(0),ct_zone(0),ct_mark(0),ct_label(0),
>>>     packet_type(ns=0,id=0),
>>>     eth(src=a6:0a:bf:e2:f3:f2,dst=62:23:0f:f6:2c:75),
>>>     eth_type(0x0800),ipv4(src=10.1.1.1,dst=10.1.1.2,proto=17,tos=0,
>>>     ttl=64,frag=no),udp(src=37380,dst=10000), packets:0, bytes:0,
>>>     used:never
>>> ...
>>> 2023-10-09T14:11:55.242Z|00049|dpif(revalidator17)|WARN|netdev@ovs-netdev:
>>>     failed to flow_get (No such file or directory)
>>>     ufid:07046e91-30a6-4862-9048-1a76b5a88a5b <empty>, packets:0,
>>>     bytes:0, used:never
>>> 2023-10-09T14:11:55.242Z|00050|ofproto_dpif_upcall(revalidator17)|WARN|
>>>     Failed to acquire udpif_key corresponding to unexpected flow
>>>     (No such file or directory):
>>>     ufid:07046e91-30a6-4862-9048-1a76b5a88a5b
>>> ...
>>> 2023-10-09T14:11:55.242Z|00183|unixctl|DBG|replying with success, id=0: ""
>>>
>>> To avoid this race, a first part of the fix is to pause (if not already
>>> paused) the revalidators while the main thread is purging the datapath
>>> flows.
>>>
>>> Then a second issue is observed by running the same unit test with the
>>> kernel datapath. Its dpif implementation dumps flows via a netlink request
>>> (see dpif_flow_dump_create(), dpif_netlink_flow_dump_create(),
>>> nl_dump_start(), nl_sock_send__()) in the leader revalidator thread,
>>> before pausing revalidators:
>>>
>>> 2023-10-09T14:44:28.742Z|00122|unixctl|DBG|received request
>>>     revalidator/purge[], id=0
>>> ...
>>> 2023-10-09T14:44:28.742Z|00125|dpif|DBG|system@ovs-system: flow_del
>>>     ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 recirc_id(0),dp_hash(0),
>>>     skb_priority(0),in_port(2),skb_mark(0),ct_state(0),ct_zone(0),
>>>     ct_mark(0),ct_label(0),eth(src=a6:0a:bf:e2:f3:f2,
>>>     dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.1.1.1,
>>>     tip=10.1.1.2,op=1,sha=a6:0a:bf:e2:f3:f2,tha=00:00:00:00:00:00),
>>>     packets:0, bytes:0, used:never
>>> ...
>>> 2023-10-09T14:44:28.742Z|00129|unixctl|DBG|replying with success, id=0: ""
>>> ...
>>> 2023-10-09T14:44:28.742Z|00006|dpif(revalidator21)|DBG|system@ovs-system:
>>>     flow_dump ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>,
>>>     packets:0, bytes:0, used:never
>>> ...
>>> 2023-10-09T14:44:28.742Z|00012|dpif(revalidator21)|WARN|system@ovs-system:
>>>     failed to flow_get (No such file or directory)
>>>     ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9 <empty>, packets:0,
>>>     bytes:0, used:never
>>> 2023-10-09T14:44:28.742Z|00013|ofproto_dpif_upcall(revalidator21)|WARN|
>>>     Failed to acquire udpif_key corresponding to unexpected flow
>>>     (No such file or directory):
>>>     ufid:70102d81-30a1-44b9-aa76-3d02a9ffd2c9
>>>
>>> To avoid evaluating already deleted flows, the second part of the fix is
>>> to ensure that dumping from the leader revalidator thread is done out of
>>> any pause request.
>>>
>>> As a result of this patch, the unit test "offloads - delete ufid mapping
>>> if device not exist - offloads enabled" does not need to waive the random
>>> warning logs when purging dp flows.
>>>
>>> Fixes: 98bb4286970d ("tests: Add command to purge revalidators of flows.")
>>> Acked-by: Eelco Chaudron <[email protected]>
>>> Signed-off-by: David Marchand <[email protected]>
>>
>>
>> ACKing the additional changes, and confirming that running the offload test 
>> 100 times did not show any problems!
>>
>> Acked-by: Eelco Chaudron <[email protected]>
> 
> Acked-by: Simon Horman <[email protected]>
> 

Thanks!  Applied and backported down to 2.17.

Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to