On 7/3/24 12:28, Roi Dayan via dev wrote:
> 
> 
> On 03/07/2024 13:16, Eelco Chaudron wrote:
>>
>>
>> On 3 Jul 2024, at 9:33, Roi Dayan wrote:
>>
>>> It is observed in some environments that there are many more ukeys than
>>> actual DP flows. For example:
>>>
>>> $ ovs-appctl upcall/show
>>> system@ovs-system:
>>> flows : (current 7) (avg 6) (max 117) (limit 2125)
>>> offloaded flows : 525
>>> dump duration : 1063ms
>>> ufid enabled : true
>>>
>>> 23: (keys 3612)
>>> 24: (keys 3625)
>>> 25: (keys 3485)
>>>
>>> The revalidator threads are busy revalidating the stale ukeys leading to
>>> high CPU and long dump duration.
>>>
>>> This patch adds checks in the sweep phase for such ukeys and moves them
>>> to DELETE so that they can be cleared eventually.
>>
>> Thanks, Roi, for the patch update; one small issue below. Let’s also discuss
>> a potential test case a bit more before sending a v3.
>>
>> Cheers,
>>
>> Eelco
>>
>>
> 
> I also replied in v0, but I'm replying here so we can continue the discussion
> in v2.
> 
> I did this for testing this case:
> 
> - Create a bridge with 2 veth ports and configure IPs.
> - Ping between the ports to have tc rules.
> - Repeat a few times: clear the tc rules with "tc filter del dev veth1
>   ingress", and do the same on the 2nd port.
> - Set max-idle to 1 and then remove it to cause a flush of the rules.
> - Create another set of veth ports. Add/del veth4 from the bridge a few times
>   to cause a sweep.
> - Before the fix: ovs-appctl upcall/show will show ukeys.
> - After the fix: upcall/show will show 0 ukeys.
> 
> 
> What do you think?
> I wonder if there is a cleaner way to do a sweep with purge=false, as a purge
> will just skip the seq mismatch check and set every ukey to delete.

One problem here is that flow dumps are inherently racy.
We can dump the same flow multiple times as well as not
receive some flows that are actually in the datapath.
So, we can't just remove ukeys if we missed a flow once
or twice.  If we didn't see the flow in the dumps for the
max-idle interval, that might be a better indicator that
something is wrong.

One more problem here, however, is that ukey statistics can
keep increasing if there is some traffic that matches.
Since the datapath flow doesn't exist, all this traffic
will go to userspace updating the ukey stats, but the
flow will never be installed again.  And we will not be
able to fix that in this patch since the stats will grow.

We need a way to tell that we haven't seen this flow in
the dumps for a long time and stats are not a good indicator
in this case.
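
To illustrate the idea, here is a rough sketch in Python rather than
OVS C code.  All names in it (Ukey, last_dumped_ms, sweep) are
hypothetical and do not exist in ofproto-dpif-upcall.c, and the
10000 ms used in the usage note is only an example max-idle value.
The point is that staleness is judged by when the flow was last seen
in a dump, not by whether the stats changed:

```python
from dataclasses import dataclass


@dataclass
class Ukey:
    """Hypothetical stand-in for a ukey; not the OVS struct udpif_key."""
    ufid: str
    last_dumped_ms: int   # Last time a datapath dump returned this flow.
    n_packets: int = 0    # Stats; may keep growing from upcalls alone.


def sweep(ukeys, dumped_ufids, now_ms, max_idle_ms):
    """Return ufids of ukeys absent from the dumps for more than max_idle_ms.

    Missing a single dump is tolerated because dumps are racy; only a
    ukey that has not been dumped for the whole interval is reported.
    Stats are deliberately ignored.
    """
    stale = []
    for ukey in ukeys:
        if ukey.ufid in dumped_ufids:
            ukey.last_dumped_ms = now_ms
        elif now_ms - ukey.last_dumped_ms > max_idle_ms:
            stale.append(ukey.ufid)
    return stale
```

With two ukeys last dumped at time 0, sweep(ukeys, {"a"}, 500, 10000)
reports nothing, since "b" has only been missing for 500 ms.  Calling
it again at 10501 ms with "b" still missing reports "b", even if its
packet counter grew in between.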

Best regards, Ilya Maximets.