On 3/14/26 8:25 PM, Florian Westphal wrote:
Fernando Fernandez Mancera <[email protected]> wrote:
On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
Hi,

On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
Control: forwarded -1 https://lore.kernel.org/regressions/[email protected]
Control: tags -1 + upstream

Hi

In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
update the count if add was skipped"), when the following rule is set

     iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset

connections get stuck. It can be easily reproduced by:

# iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
# nft list ruleset
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
          chain INPUT {
                  type filter hook input priority filter; policy accept;
                  ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
          }
}
# wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
--2026-03-14 14:53:51--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
--2026-03-14 14:53:51--  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
Reusing existing connection to git.kernel.org:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘/dev/null’

/dev/null                         [ <=>                    ] 248.03M  51.9MB/s    in 5.0s

2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]

# wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
--2026-03-14 14:53:58--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.

Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
if add was skipped") commit this worked.


Thanks for the report. I have reproduced this on upstream kernel. I am working on it.


This is what is happening:

1. The first connection is established and tracked, all good. When it finishes, it goes to TIME_WAIT state.
2. The second connection is established; its ct is confirmed from the beginning, so tracking is skipped and a GC is called.
3. The previously tracked connection is cleaned up during GC, as TIME_WAIT is considered closed.
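
To make the sequence above easier to follow, here is a small userspace model of it. This is only a sketch and not the kernel code: every name (toy_conn, toy_list, toy_gc, toy_add, toy_scenario) is made up for illustration, and it simply assumes that an unconfirmed connection takes the tracked path while a confirmed one skips tracking and triggers a GC, as described in the steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TRACKED 8

/* Simplified TCP conntrack states; only what the scenario needs. */
enum toy_tcp_state { TOY_ESTABLISHED, TOY_TIME_WAIT };

/* Hypothetical stand-in for a conntrack entry. */
struct toy_conn {
	enum toy_tcp_state state;
	bool confirmed;
};

/* Hypothetical stand-in for a conncount list. */
struct toy_list {
	const struct toy_conn *tracked[TOY_MAX_TRACKED];
	size_t count;
};

/* GC: drop every tracked entry that is considered closed (TIME_WAIT). */
void toy_gc(struct toy_list *l)
{
	size_t kept = 0;

	for (size_t i = 0; i < l->count; i++) {
		if (l->tracked[i]->state != TOY_TIME_WAIT)
			l->tracked[kept++] = l->tracked[i];
	}
	l->count = kept;
}

/* Returns the number of tracked connections after handling 'c'. */
size_t toy_add(struct toy_list *l, const struct toy_conn *c)
{
	if (c->confirmed) {
		/* Assumption from step 2: skip tracking, only run the GC. */
		toy_gc(l);
		return l->count;
	}

	assert(l->count < TOY_MAX_TRACKED);
	l->tracked[l->count++] = c;
	return l->count;
}

/* Replays steps 1-3 and returns the final count. */
size_t toy_scenario(void)
{
	struct toy_list list = { .count = 0 };
	struct toy_conn first = { .state = TOY_ESTABLISHED, .confirmed = false };
	struct toy_conn second = { .state = TOY_ESTABLISHED, .confirmed = true };

	/* Step 1: first connection is tracked, then finishes. */
	size_t after_first = toy_add(&list, &first);
	assert(after_first == 1);
	first.state = TOY_TIME_WAIT;

	/* Steps 2 and 3: second connection skips tracking, GC evicts the first. */
	return toy_add(&list, &second);
}
```

In this model toy_scenario() ends with an empty list: the second connection is live but was never added, and the only tracked entry was evicted by the GC.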

This is stupid.  The fix is to add --syn or use OUTPUT.  It's not even clear to me what the user wants to achieve with this rule.


Yes, the ruleset shown does not make sense. Having said this, it could affect a soft-limit scenario such as the one described in the blamed commit.

xt_connlimit was designed with --syn in mind, but this was not enforced and people used it for many different things. At least we are learning that many people ignored --syn completely.

+static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
+{
+       if (nf_ct_protonum(conn) == IPPROTO_TCP)
+               return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
+                      conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
+       else
+               return false;
+}

We're adding ever more complex checks in the conncount backend.
I don't like any of the solutions.


As we are already fetching the ct, would it be fine if we instead went for a protocol-agnostic solution with:

if (ctinfo == IP_CT_NEW)
        goto check_connections;

inside the confirmed if statement? If I am not wrong, it should be a valid solution too and IMHO a better one.
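
For illustration only, the control flow of that proposal could be modelled in userspace roughly as below. This is a sketch, not a patch: toy_ctinfo, toy_action and toy_decide are hypothetical names, the enum values are not the kernel's, and the model assumes that an unconfirmed ct always takes the tracked path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for enum ip_conntrack_info; values are not the kernel's. */
enum toy_ctinfo { TOY_CT_ESTABLISHED, TOY_CT_RELATED, TOY_CT_NEW };

/* What the model does with the connection. */
enum toy_action {
	TOY_TRACK,         /* models "goto check_connections" */
	TOY_SKIP_TRACKING  /* models skipping the add for a confirmed ct */
};

enum toy_action toy_decide(bool confirmed, enum toy_ctinfo ctinfo)
{
	if (confirmed) {
		/* Proposed protocol-agnostic exception inside the confirmed branch. */
		if (ctinfo == TOY_CT_NEW)
			return TOY_TRACK;
		return TOY_SKIP_TRACKING;
	}

	/* Assumption: an unconfirmed ct is always tracked. */
	return TOY_TRACK;
}
```

The point of the sketch is only that the decision depends on ctinfo rather than on the TCP state, so no per-protocol helper is needed.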

What about reverting the offending commit, at least for tree_count?
That way it continues to work as it did in the past.


Before the fix, soft-limiting scenarios were broken and therefore this specific ruleset was too. I hope this is not a ruleset in production and it is just for reproducing the issue.

P.S.: I have been investigating a way to improve the conncount backend structure so that the GC is not that expensive. I don't have anything relevant yet, but I plan to provide some updates.
