On 12/15/25 12:07 PM, Rukomoinikova Aleksandra wrote:
On 15.12.2025 14:00, Fernando Fernandez Mancera wrote:
On 12/12/25 10:27 PM, Rukomoinikova Aleksandra wrote:
Hi one more time! I found another issue. I'll describe it below.
In my opinion, it became relevant after merging [1]. I saw that a fix for this commit was merged last week, but it doesn't fix the case I'll describe below.
Just to be sure, you have tested this with the latest mainline kernel, right? Because, as you mentioned, we merged several fixes addressing outdated counts.
Hi! Yes.
I create limits via Open vSwitch and run a TCP flood this way:

hping3 -S -I host -p 10880 -i u5 10.255.41.101 -c 100

(Here, -i u5 is important, i.e. the sending interval is smaller than a jiffy; otherwise the issue most likely won't reproduce.)

And I set the following limit on the zone where these connections arrive:

ovs-dpctl ct-set-limits zone=9,limit=100
I start the traffic and see it on the interface: TCP SYN -> TCP SYN+ACK -> TCP RST. Zone 9 overflows, but the connections immediately become CLOSED.
I run hping3 again (hping3 -S -I host -p 10880 -i u5 10.255.41.101 -c 100): now I see nothing on the interface, and dmesg shows messages from openvswitch saying the number of connections exceeds the limit. At this moment, if I call ovs-dpctl ct-get-limits, traffic immediately starts flowing again because the connection count gets reset to zero.
To me it seems that openvswitch should perform a GC somewhere similar to
what we did on nft_connlimit/xt_connlimit.
hm, Thanks! I'll try to think about implementing this.
I investigated this and managed to reproduce it. hping3 with the option you mentioned sends a burst of packets that makes openvswitch process more than 8 packets per jiffy.
That is relevant because of two different things:
1. Currently __nf_conncount_add() stops cleaning up connections after 8 connections.
2. The optimization that was introduced to perform only one GC per jiffy.
Therefore, if someone manages to pass traffic fast enough, that number can grow quite fast.
I would like to propose increasing that number to 64, although it might make sense to match the net.core.netdev_budget default value, which is currently 300. I have tested both values with OVS and either one solves the problem. Any opinions? What do you think?
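Concretely, I mean bumping the define in nf_conncount.c - just a sketch, untested, and the exact line/whitespace may differ in the current tree:

-#define CONNCOUNT_GC_MAX_NODES	8
+#define CONNCOUNT_GC_MAX_NODES	64	/* or 300 to match the netdev_budget default */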
Also, just as a note: this issue was introduced by d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC"); I bisected it.
I think the problem is as follows: before commit [1], we called __nf_conncount_gc_list for every connection, and this function iterates over all connections and cleans up those that are already closed.
What we have now is that, when trying to add a connection in __nf_conncount_add, if we don't find it, then while handling that error we don't continue the iteration any further and immediately exit the function with zero, which represents the current connection count - we clean connections in the list only until we reach the connection we want to commit now, meaning the connection count becomes outdated.
Furthermore, we then check whether the found connections are already_closed and increment the collect variable, which also doesn't allow connections to be fully cleaned up; we clean up at most 8 (CONNCOUNT_GC_MAX_NODES) entries per call to __nf_conncount_add.
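(To be clear, the cap I mean is the check at the top of the scan loop in __nf_conncount_add(); paraphrasing from memory, not the exact source:

	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
		if (collect > CONNCOUNT_GC_MAX_NODES)
			break;	/* give up cleaning after ~8 collected entries */
		...
	}

so even when the list is full of CLOSED entries, a single call can only free a handful of them.)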
Also, with such a TCP attack - when connections transition to CLOSED
immediately - __nf_conncount_gc_list won't be called at all, because we
will constantly be calling __nf_conncount_add and updating the last_gc
status. That's why ovs-dpctl ct-get-limits helps; it simply calls
__nf_conncount_gc_list and cleans up all closed connections.
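(Conceptually - I'm paraphrasing, the exact code in nf_conncount_gc_list() may differ - the skip looks like:

	/* at most one full GC pass per jiffy */
	if (READ_ONCE(list->last_gc) == (u32)jiffies)
		return false;

and since __nf_conncount_add() refreshes list->last_gc on every packet, with this traffic pattern the full GC pass never gets a chance to run.)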
I propose the following behavior, which is similar to what we had before [2]:
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 19039a0802b8..e5224785f01e 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -171,6 +171,7 @@ static int __nf_conncount_add(struct net *net,
 	struct nf_conn *found_ct;
 	unsigned int collect = 0;
 	bool refcounted = false;
+	bool need_add = true;
 	if (!get_ct_or_tuple_from_skb(net, skb, l3num, &ct, &tuple, &zone,
 				      &refcounted))
 		return -ENOENT;
@@ -196,7 +197,8 @@ static int __nf_conncount_add(struct net *net,
 				if (nf_ct_tuple_equal(&conn->tuple, &tuple) &&
 				    nf_ct_zone_id(&conn->zone, conn->zone.dir) ==
 				    nf_ct_zone_id(zone, zone->dir))
-					goto out_put; /* already exists */
+					/* already exists */
+					need_add = false;
I don't see the point of continuing here. If we reached this point, it means the ct is already tracked. Sure, the count is not being updated, but does that matter? This connection is already tracked.
 			} else {
 				collect++;
 			}
@@ -214,7 +216,7 @@ static int __nf_conncount_add(struct net *net,
 			 * Attempt to avoid a re-add in this case.
 			 */
 			nf_ct_put(found_ct);
-			goto out_put;
+			need_add = false;
 		} else if (already_closed(found_ct)) {
 			/*
 			 * we do not care about connections which are
@@ -222,13 +224,16 @@ static int __nf_conncount_add(struct net *net,
 			 */
 			nf_ct_put(found_ct);
 			conn_free(list, conn);
-			collect++;
 			continue;
 		}
This worries me a bit: it would mean that for every legit add operation the function will go through ALL the tracked connections, which might be a really huge number. This would impact performance.
IMO, this is probably the only relevant line of the patch for your use case; I do not think the others have any impact. I am wondering if this can instead be fixed from the openvswitch side, by calling the GC when needed.
I am going to try to reproduce this locally on a machine I have. Let's see what I can get.
Thanks!
Thanks,
Fernando.
 		nf_ct_put(found_ct);
 	}
 
+	if (!need_add) {
+		goto out_put;
+	}
+
 add_new_node:
[1] netfilter: nf_conncount: reduce unnecessary GC
    https://github.com/torvalds/linux/commit/d265929930e2ffafc744c0ae05fb70acd53be1ee
[2] netfilter: nf_conncount: merge lookup and add functions
    https://github.com/torvalds/linux/commit/df4a902509766897f7371fdfa4c3bf8bc321b55d
[3] netfilter: nft_connlimit: update the count if add was skipped
    https://github.com/torvalds/linux/commit/69894e5b4c5e28cda5f32af33d4a92b7a4b93b0e
Do you think I missed any cases, and how will this affect the function's
performance? Thanks)