On 9 Jul 2024, at 23:38, Ilya Maximets wrote:

> On 7/3/24 12:28, Roi Dayan via dev wrote:
>>
>>
>> On 03/07/2024 13:16, Eelco Chaudron wrote:
>>>
>>>
>>> On 3 Jul 2024, at 9:33, Roi Dayan wrote:
>>>
>>>> It is observed in some environments that there are much more ukeys than
>>>> actual DP flows. For example:
>>>>
>>>> $ ovs-appctl upcall/show
>>>> system@ovs-system:
>>>> flows : (current 7) (avg 6) (max 117) (limit 2125)
>>>> offloaded flows : 525
>>>> dump duration : 1063ms
>>>> ufid enabled : true
>>>>
>>>> 23: (keys 3612)
>>>> 24: (keys 3625)
>>>> 25: (keys 3485)
>>>>
>>>> The revalidator threads are busy revalidating the stale ukeys leading to
>>>> high CPU and long dump duration.
>>>>
>>>> This patch adds checks in the sweep phase for such ukeys and move them
>>>> to DELETE so that they can be cleared eventually.
>>>
>>> Thank Roi for the patch update, one small issue below. Let’s also discuss a 
>>> bit more a potential testcase before sending a v3.
>>>
>>> Cheers,
>>>
>>> Eelco
>>>
>>>
>>
>> I also replied in v0 but replying here so we can continue the discussion 
>> here in v2.
>>
>> I did this for testing this case:
>>
>> - create bridge with 2 veth ports. configure ips.
>> - ping between the ports to have tc rules.
>> - repeated few times: clear the tc rules like this: tc filter del dev veth1 
>> ingress and also on the 2nd port.
>> - set max-idle to 1 and remove it to cause a flush of the rules.
>> - create another set of veth ports. add/del veth4 from the bridge a few 
>> times to cause a sweep.
>> - before the fix: ovs-appctl upcall/show will show ukeys.
>> - after the fix upcall/show will show 0 ukeys.
>>
>>
>> what do you think?
>> I think if there is a cleaner way to do a sweep with purge=false as just
>> will just skip seq mismatch check and just set every ukey to delete.
>
> One problem here is that flow dumps are inherently racy.
> We can dump the same flow multiple times as well as not
> receive some flows that are actually in the datapath.
> So, we can't just remove ukeys if we missed a flow once
> or twice.  If we didn't see the flow in the dumps for the
> max-idle interval, that might be a better indicator that
> something is wrong.
>
> One more problem here however is that ukey statistics can
> keep increasing if there is some traffic that matches.
> Since the datapath flow doesn't exist, all this traffic
> will go to userspace updating the ukey stats, but the
> flow will never be installed again.  And we will not be
> able to fix that in this patch since the stats will grow.

Yes, I noticed this when trying to add a quick testcase for this. The entries 
were used, so it would never hit the code path.


> We need a way to tell that we haven't seen this flow in
> the dumps for a long time and stats are not a good indicator
> in this case.

Agreed, I made a rough testcase and a patch that does this. Please take a look 
and use whatever you need for your patch. I can also try to work on this if you 
want, but it will be another three weeks as I’m about to go on PTO.

Here are the patches (diffs):

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index 83609ec62..8b449537d 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -57,6 +57,7 @@ COVERAGE_DEFINE(dumped_inconsistent_flow);
 COVERAGE_DEFINE(dumped_new_flow);
 COVERAGE_DEFINE(handler_duplicate_upcall);
 COVERAGE_DEFINE(revalidate_missed_dp_flow);
+COVERAGE_DEFINE(revalidate_missing_dp_flow);
 COVERAGE_DEFINE(ukey_dp_change);
 COVERAGE_DEFINE(ukey_invalid_stat_reset);
 COVERAGE_DEFINE(ukey_replace_contention);
@@ -278,6 +279,7 @@ enum flow_del_reason {
     FDR_BAD_ODP_FIT,        /* Bad ODP flow fit. */
     FDR_FLOW_IDLE,          /* Flow idle timeout. */
     FDR_FLOW_LIMIT,         /* Kill all flows condition reached. */
+    FDR_FLOW_MISSING_DP,    /* Flow is missing from the datapath. */
     FDR_FLOW_WILDCARDED,    /* Flow needs a narrower wildcard mask. */
     FDR_NO_OFPROTO,         /* Bridge not found. */
     FDR_PURGE,              /* User requested flow deletion. */
@@ -315,6 +317,7 @@ struct udpif_key {
     struct dpif_flow_stats stats OVS_GUARDED; /* Last known stats.*/
     const char *dp_layer OVS_GUARDED;         /* Last known dp_layer. */
     long long int created OVS_GUARDED;        /* Estimate of creation time. */
+    long long int last_dumped OVS_GUARDED;    /* Flow last dump time. */
     uint64_t dump_seq OVS_GUARDED;            /* Tracks udpif->dump_seq. */
     uint64_t reval_seq OVS_GUARDED;           /* Tracks udpif->reval_seq. */
     enum ukey_state state OVS_GUARDED;        /* Tracks ukey lifetime. */
@@ -1819,6 +1822,7 @@ ukey_create__(const struct nlattr *key, size_t key_len,
     ukey->state_thread = ovsthread_id_self();
     ukey->state_where = OVS_SOURCE_LOCATOR;
     ukey->created = ukey->flow_time = time_msec();
+    ukey->last_dumped = 0;
     memset(&ukey->stats, 0, sizeof ukey->stats);
     ukey->stats.used = used;
     ukey->dp_layer = NULL;
@@ -2450,7 +2454,14 @@ revalidate_ukey(struct udpif *udpif, struct udpif_key 
*ukey,
         log_unexpected_stats_jump(ukey, stats);
     }

-    if (need_revalidate) {
+    if ((ukey->last_dumped ? ukey->last_dumped : ukey->created)
+        < udpif->dpif->current_ms - (2 * ofproto_max_idle)) {
+        /* If the flow was not dumped for at least twice the idle time,
+         * we can assume the datapath flow now longer exists and the ukey
+         * should be deleted. */
+        COVERAGE_INC(revalidate_missing_dp_flow);
+        *del_reason = FDR_FLOW_MISSING_DP;
+    } else if (need_revalidate) {
         if (should_revalidate(udpif, ukey, push.n_packets)) {
             if (!ukey->xcache) {
                 ukey->xcache = xlate_cache_new();
@@ -2884,6 +2895,7 @@ revalidate(struct revalidator *revalidator)
                 continue;
             }

+            ukey->last_dumped = now;
             ukey->offloaded = f->attrs.offloaded;
             if (!ukey->dp_layer
                 || (!dpif_synced_dp_layers(udpif->dpif)
diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index 7dbf0e97c..87a64d453 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -132,8 +132,8 @@ dnl Wait until the ARP flow has timed out.
 OVS_WAIT_UNTIL([test `ovs-appctl dpctl/dump-flows type=tc,offloaded | \
                   grep "arp" | wc -l` -eq 0])

-dnl Set max-idle to 200 ms, so we are sure the max-idle * 2 has been reached.
-AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:max-idle=200])
+dnl Set max-idle to 10 ms, so we are sure the max-idle * 2 has been reached.
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:max-idle=10])

 dnl Add a port to force the revalidation.
 REV_RECONFIGURE=$(ovs-appctl coverage/read-counter rev_reconfigure)
diff --git a/utilities/usdt-scripts/flow_reval_monitor.py 
b/utilities/usdt-scripts/flow_reval_monitor.py
index 28479a565..1de06c73b 100755
--- a/utilities/usdt-scripts/flow_reval_monitor.py
+++ b/utilities/usdt-scripts/flow_reval_monitor.py
@@ -249,6 +249,7 @@ FdrReasons = IntEnum(
         "FDR_BAD_ODP_FIT",
         "FDR_FLOW_IDLE",
         "FDR_FLOW_LIMIT",
+        "FDR_FLOW_MISSING_DP",
         "FDR_FLOW_WILDCARDED",
         "FDR_NO_OFPROTO",
         "FDR_PURGE",
@@ -265,6 +266,7 @@ FdrReasonStrings = {
     FdrReasons.FDR_BAD_ODP_FIT: "Bad ODP flow fit",
     FdrReasons.FDR_FLOW_IDLE: "Flow idle timeout",
     FdrReasons.FDR_FLOW_LIMIT: "Kill all flows condition reached",
+    FdrReasons.FDR_FLOW_MISSING_DP: "Flow is missing from the datapath",
     FdrReasons.FDR_FLOW_WILDCARDED: "Flow needs a narrower wildcard mask",
     FdrReasons.FDR_NO_OFPROTO: "Bridge not found",
     FdrReasons.FDR_PURGE: "User requested flow deletion",


diff --git a/tests/system-offloads-traffic.at b/tests/system-offloads-traffic.at
index d1da33d96..7dbf0e97c 100644
--- a/tests/system-offloads-traffic.at
+++ b/tests/system-offloads-traffic.at
@@ -93,6 +93,66 @@ AT_CHECK([ovs-appctl upcall/show | grep -E "offloaded flows 
: [[1-9]]"], [0], [i
 OVS_TRAFFIC_VSWITCHD_STOP
 AT_CLEANUP

+AT_SETUP([offloads - Forced removed datapath entries])
+OVS_TRAFFIC_VSWITCHD_START([], [], [-- set Open_vSwitch . 
other_config:hw-offload=true])
+
+AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])
+
+ADD_NAMESPACES(at_ns0, at_ns1)
+
+ADD_VETH(p0, at_ns0, br0, "10.1.1.1/24")
+ADD_VETH(p1, at_ns1, br0, "10.1.1.2/24")
+AT_CHECK([ovs-appctl dpctl/dump-flows], [0], [ignore])
+
+AT_CHECK([ovs-appctl vlog/set ofproto_dpif_upcall:dbg])
+
+dnl Set idle timeout to 1 second.
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:max-idle=1000])
+
+NS_CHECK_EXEC([at_ns0], [ping -q -c 10 -i 0.1 -W 2 10.1.1.2 | FORMAT_PING], 
[0], [dnl
+10 packets transmitted, 10 received, 0% packet loss, time 0ms
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep 
"eth_type(0x0800)" | DUMP_CLEAN_SORTED], [0], [dnl
+in_port(2),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:756, 
used:0.001s, actions:output
+in_port(3),eth(macs),eth_type(0x0800),ipv4(frag=no), packets:9, bytes:756, 
used:0.001s, actions:output
+])
+
+AT_CHECK([ovs-appctl dpctl/dump-flows type=tc,offloaded | grep -q "arp" ], [0])
+
+dnl Delete IPv4 entries, but keep the ARP ones.
+AT_CHECK([tc filter del dev ovs-p0 ingress protocol ip pref 2])
+AT_CHECK([tc filter del dev ovs-p1 ingress protocol ip pref 2])
+
+dnl Bring the remote ports down to avoid traffic.
+AT_CHECK([ip -n at_ns0 link set p0 down])
+AT_CHECK([ip -n at_ns1 link set p1 down])
+
+dnl Wait until the ARP flow has timed out.
+OVS_WAIT_UNTIL([test `ovs-appctl dpctl/dump-flows type=tc,offloaded | \
+                  grep "arp" | wc -l` -eq 0])
+
+dnl Set max-idle to 200 ms, so we are sure the max-idle * 2 has been reached.
+AT_CHECK([ovs-vsctl set Open_vSwitch . other_config:max-idle=200])
+
+dnl Add a port to force the revalidation.
+REV_RECONFIGURE=$(ovs-appctl coverage/read-counter rev_reconfigure)
+AT_CHECK([ovs-vsctl add-port br0 ovs-p2 -- set interface ovs-p2 type=internal])
+OVS_WAIT_UNTIL([test `ovs-appctl coverage/read-counter rev_reconfigure` \
+                  -gt $REV_RECONFIGURE])
+
+dnl Wait for another full round of revalidation.  After this all ukeys should
+dnl be gone.
+AT_CHECK([ovs-appctl revalidator/wait], [0])
+AT_CHECK([ovs-appctl revalidator/wait], [0])
+
+dnl Make sure no more ukeys exists.
+AT_CHECK([ovs-appctl upcall/show | grep '(keys' | awk '{print $3}' | \
+            grep -qv '0)'], [1])
+
+OVS_TRAFFIC_VSWITCHD_STOP(["/failed to flow_del (No such file or 
directory)/d"])
+AT_CLEANUP
+
 AT_SETUP([offloads - set ingress_policing_rate and ingress_policing_burst - 
offloads disabled])
 AT_KEYWORDS([ingress_policing])
 OVS_CHECK_TC_QDISC()




_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to