On 3/14/26 5:13 PM, Fernando Fernandez Mancera wrote:
Hi,

On 3/14/26 3:03 PM, Salvatore Bonaccorso wrote:
Control: forwarded -1 https://lore.kernel.org/regressions/[email protected]
Control: tags -1 + upstream

Hi

In Debian, in https://bugs.debian.org/1130336, Alejandro reported that
after updates including 69894e5b4c5e ("netfilter: nft_connlimit:
update the count if add was skipped"), when the following rule is set

    iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset

connections get stuck. It can be easily reproduced by:

# iptables -A INPUT -p tcp -m connlimit --connlimit-above 111 -j REJECT --reject-with tcp-reset
# nft list ruleset
# Warning: table ip filter is managed by iptables-nft, do not touch!
table ip filter {
         chain INPUT {
                 type filter hook input priority filter; policy accept;
                 ip protocol tcp xt match "connlimit" counter packets 0 bytes 0 reject with tcp reset
         }
}
# wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
--2026-03-14 14:53:51--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz [following]
--2026-03-14 14:53:51--  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/snapshot/linux-7.0-rc3.tar.gz
Reusing existing connection to git.kernel.org:443.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/x-gzip]
Saving to: ‘/dev/null’

/dev/null                         [ <=>                    ] 248.03M  51.9MB/s    in 5.0s

2026-03-14 14:53:56 (49.3 MB/s) - ‘/dev/null’ saved [260080129]

# wget -O /dev/null https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
--2026-03-14 14:53:58--  https://git.kernel.org/torvalds/t/linux-7.0-rc3.tar.gz
Resolving git.kernel.org (git.kernel.org)... 172.105.64.184, 2a01:7e01:e001:937:0:1991:8:25
Connecting to git.kernel.org (git.kernel.org)|172.105.64.184|:443... failed: Connection timed out.
Connecting to git.kernel.org (git.kernel.org)|2a01:7e01:e001:937:0:1991:8:25|:443... failed: Network is unreachable.

Before the 69894e5b4c5e ("netfilter: nft_connlimit: update the count
if add was skipped") commit this worked.


Thanks for the report. I have reproduced this on upstream kernel. I am working on it.


This is what is happening:

1. The first connection is established and tracked, all good. When it finishes, it goes to TIME_WAIT state.
2. The second connection is established, ct is confirmed since the beginning, skipping the tracking and calling a GC.
3. The previously tracked connection is cleaned up during GC as TIME_WAIT is considered closed.
4. count is therefore 0 and xt performs a drop.
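
To make the sequence above easier to follow, here is a small user-space toy model of it. This is only a sketch under assumptions: every name in it (toy_conn, toy_list, toy_gc, toy_add, toy_replay) is made up for illustration, and it models only the behaviour described in the steps, not the real nf_conncount data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; none of these are kernel APIs. */
enum toy_state { TOY_SYN_SENT, TOY_SYN_RECV, TOY_ESTABLISHED, TOY_TIME_WAIT };

struct toy_conn {
	int id;
	enum toy_state state;
	bool confirmed;
};

#define TOY_MAX 8

struct toy_list {
	const struct toy_conn *tracked[TOY_MAX];
	size_t count;
};

/* Mirrors the idea from step 3: TIME_WAIT is considered closed. */
static bool toy_already_closed(const struct toy_conn *c)
{
	return c->state == TOY_TIME_WAIT;
}

/* Drop every tracked entry that is considered closed. */
static void toy_gc(struct toy_list *l)
{
	size_t kept = 0;

	for (size_t i = 0; i < l->count; i++)
		if (!toy_already_closed(l->tracked[i]))
			l->tracked[kept++] = l->tracked[i];
	l->count = kept;
}

/*
 * Behaviour as described in step 2: an already confirmed connection is
 * not inserted, only a GC runs. Returns the count afterwards.
 */
static size_t toy_add(struct toy_list *l, const struct toy_conn *c)
{
	if (c->confirmed) {
		toy_gc(l);
		return l->count;
	}
	assert(l->count < TOY_MAX);
	l->tracked[l->count++] = c;
	return l->count;
}

/* Replays steps 1-4 and returns the final count. */
size_t toy_replay(void)
{
	struct toy_list l = { .count = 0 };
	struct toy_conn first = { 1, TOY_SYN_SENT, false };
	struct toy_conn second = { 2, TOY_SYN_RECV, true };

	toy_add(&l, &first);          /* step 1: tracked, count is 1 */
	first.state = TOY_TIME_WAIT;  /* first connection finishes */
	return toy_add(&l, &second);  /* steps 2-3: skipped, GC drops first */
}
```

In this model toy_replay() ends with a count of 0 while the second connection is still alive, which is the situation described in step 4.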

There are two different approaches to fix this IMHO.

The first one would be to stop considering TIME_WAIT as closed. But that would only mask the issue rather than fix it.

The second one is to check the TCP state inside the nf_ct_is_confirmed() check. If the state is SYN_SENT or SYN_RECV but the ct is already confirmed, there are two possibilities: either it is a retransmission, or the ct was confirmed before we tracked it. In both situations, perform an insert with a GC. This makes sure that no duplicate tracking happens and that the connection is tracked properly.

The following diff fixes it. What do you think? I can send a formal patch if this solution is considered acceptable.

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 00eed5b4d1b1..ae94e5d7e00b 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -78,6 +78,15 @@ static inline bool already_closed(const struct nf_conn *conn)
                return false;
 }

+static inline bool tcp_syn_sent_or_recv(const struct nf_conn *conn)
+{
+       if (nf_ct_protonum(conn) == IPPROTO_TCP)
+               return conn->proto.tcp.state == TCP_CONNTRACK_SYN_SENT ||
+                      conn->proto.tcp.state == TCP_CONNTRACK_SYN_RECV;
+       else
+               return false;
+}
+
 static int key_diff(const u32 *a, const u32 *b, unsigned int klen)
 {
        return memcmp(a, b, klen * sizeof(u32));
@@ -183,6 +192,9 @@ static int __nf_conncount_add(struct net *net,
                 * might have happened before hitting connlimit
                 */
                if (skb->skb_iif != LOOPBACK_IFINDEX) {
+                       if (tcp_syn_sent_or_recv(ct))
+                               goto check_connections;
+
                        err = -EEXIST;
                        goto out_put;
                }
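
For readers who want to see the resulting decision in isolation, here is a user-space sketch of the branch the diff changes. It is an illustration under assumptions, not kernel code: the sk_* types and sk_decide() are hypothetical stand-ins, SK_TRACK corresponds to "goto check_connections" and SK_SKIP to returning -EEXIST. What happens outside the confirmed, non-loopback case is not visible in the diff, so the sketch simply assumes those cases proceed to tracking.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins; not the kernel's types. */
enum sk_proto { SK_PROTO_TCP, SK_PROTO_OTHER };
enum sk_tcp_state { SK_SYN_SENT, SK_SYN_RECV, SK_ESTABLISHED, SK_TIME_WAIT };
enum sk_action { SK_TRACK, SK_SKIP };

struct sk_conn {
	enum sk_proto proto;
	enum sk_tcp_state state;
	bool confirmed;   /* stands in for nf_ct_is_confirmed(ct) */
	bool loopback;    /* stands in for skb_iif == LOOPBACK_IFINDEX */
};

/* Same shape as tcp_syn_sent_or_recv() in the diff. */
static bool sk_syn_sent_or_recv(const struct sk_conn *c)
{
	if (c->proto != SK_PROTO_TCP)
		return false;
	return c->state == SK_SYN_SENT || c->state == SK_SYN_RECV;
}

enum sk_action sk_decide(const struct sk_conn *c)
{
	if (c->confirmed && !c->loopback) {
		/* New with the diff: still handshaking, so track it. */
		if (sk_syn_sent_or_recv(c))
			return SK_TRACK;
		/* Unchanged: treated as already tracked, add is skipped. */
		return SK_SKIP;
	}
	/* Assumption of this sketch: all other cases proceed to tracking. */
	return SK_TRACK;
}
```

With this, a confirmed TCP connection arriving on a non-loopback interface in SYN_RECV is tracked, while the same connection in ESTABLISHED, or a confirmed non-TCP connection, is still skipped.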

Thanks,
Fernando.
