Hi Zhenlei,

Thanks for the suggestion. But is there a reason not to trust the core dump? Especially since the sum of all `mbuf` lengths matches the value shown in the stack frame exactly.
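To spell out why I trust the numbers: the unsigned total over the mbuf chain computed in the kgdb session below, 2149804648, truncates to exactly the len = -2145162648 seen in frame #4 once it goes through the `(int)sb->sb_ccc` cast discussed further down. Here is a minimal userland sketch of that truncation (the value comes straight from the dump; the wrap-around assumes the usual two's-complement behaviour):

```
#include <stdio.h>

int
main(void)
{
	/* Sum of m_len over the socket buffer's mbuf chain, as computed
	 * in the kgdb session quoted below; it exceeds INT_MAX. */
	unsigned int total = 2149804648u;

	/* The same kind of cast as (int)sb->sb_ccc: on the usual
	 * two's-complement targets the value wraps to a negative int. */
	int len = (int)total;

	printf("total = %u, (int)total = %d\n", total, len);
	/* Prints: total = 2149804648, (int)total = -2145162648 */
	return (0);
}
```

(A simplified model of the resulting flush-loop stall is appended after the quoted thread at the bottom of this mail.)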

> On Aug 29, 2025, at 5:08 PM, Paul <de...@ukr.net> wrote:
> 
> > Hi!
> > 
> > We have finally managed to reproduce this issue with the help of iperf3.
> > 
> > We triggered a kernel panic with `sysctl debug.kdb.panic=1` to collect a
> > core dump once the iperf3 process had entered the infinite loop.
> > 
> > Here is the basic analysis, please ask for more if required:
> > 
> > (kgdb) bt
> > #0  cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1530
> > #1  0xffffffff808deec8 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1487
> > #2  0xffffffff8090c7af in trap (frame=0xfffffe03edeb8f30) at /usr/src/sys/amd64/amd64/trap.c:248
> > #3  <signal handler called>
> > #4  0xffffffff80640e30 in sbcut_internal (sb=sb@entry=0xfffff801b0ec6e00, len=-2145162648) at /usr/src/sys/kern/uipc_sockbuf.c:1585
> > #5  0xffffffff80640d78 in sbflush_internal (sb=<optimized out>) at /usr/src/sys/kern/uipc_sockbuf.c:1547
> > #6  sbflush_locked (sb=<optimized out>) at /usr/src/sys/kern/uipc_sockbuf.c:1559
> > #7  sbflush (sb=sb@entry=0xfffff801b0ec6e00) at /usr/src/sys/kern/uipc_sockbuf.c:1567
> > #8  0xffffffff807488f3 in tcp_disconnect (tp=0xfffff8034a572a80) at /usr/src/sys/netinet/tcp_usrreq.c:2702
> > #9  0xffffffff80743897 in tcp_usr_disconnect (so=<optimized out>) at /usr/src/sys/netinet/tcp_usrreq.c:704
> > #10 0xffffffff80643655 in sodisconnect (so=0xfffff801b0ec6c00) at /usr/src/sys/kern/uipc_socket.c:2085
> > #11 soclose (so=0xfffff801b0ec6c00) at /usr/src/sys/kern/uipc_socket.c:1920
> > #12 0xffffffff8053e921 in fo_close (fp=0xfffff801b0ec6e00, fp@entry=0xfffff801a51ab410, td=0x80236a68, td@entry=0xfffff801a51ab410) at /usr/src/sys/sys/file.h:397
> > #13 _fdrop (fp=0xfffff801b0ec6e00, fp@entry=0xfffff801a51ab410, td=0x80236a68, td@entry=0xfffff80276bcd740) at /usr/src/sys/kern/kern_descrip.c:3756
> > #14 0xffffffff80541aca in closef (fp=0xfffff801a51ab410, td=0xfffff80276bcd740) at /usr/src/sys/kern/kern_descrip.c:2851
> > #15 0xffffffff80545e08 in closefp_impl (fdp=<optimized out>, fd=<optimized out>, fp=<optimized out>, td=<optimized out>, audit=<optimized out>) at /usr/src/sys/kern/kern_descrip.c:1324
> > #16 0xffffffff8090de97 in syscallenter (td=0xfffff80276bcd740) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:193
> > #17 amd64_syscall (td=0xfffff80276bcd740, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1241
> > #18 <signal handler called>
> > #19 0x000000082510c87a in ?? ()
> > Backtrace stopped: Cannot access memory at address 0x820dd0058
> > 
> > (kgdb) fr 4
> > #4  0xffffffff80640e30 in sbcut_internal (sb=sb@entry=0xfffff801b0ec6e00, len=-2145162648) at /usr/src/sys/kern/uipc_sockbuf.c:1585
> > 1585            next = (m = sb->sb_mb) ? m->m_nextpkt : 0;
> > (kgdb) p len
> > $33 = -2145162648
> > (kgdb) set $total=(unsigned int)0
> > (kgdb) set $count=(unsigned int)0
> > (kgdb) set $next=(struct mbuf*)sb->sb_mb
> > (kgdb) while ($next != 0)
> >> set $total=$total+$next.m_len
> >> set $count=$count+1
> >> set $next=$next.m_next
> >> end
> > (kgdb) p $total
> > $34 = 2149804648
> > (kgdb) p (int)$total
> > $35 = -2145162648
> > (kgdb) p $count
> > $36 = 1484679
> > 
> > As mentioned before, the problem occurs when the socket is being closed.
> > Now we know why.
> > Because of a cast here:
> > 
> >     m_freem(sbcut_internal(sb, (int)sb->sb_ccc));
> > 
> > When `sb->sb_ccc` grows above the maximum value that can be stored in an
> > `int`, this cast leads to an infinite loop within this function, since a
> > `len` smaller than 0 is essentially equivalent to 0 in `sbcut_internal()`.
> 
> Just a note: there's a KASSERT in sbcut_internal() that checks the parameter len,
> 
> ```
> static struct mbuf *
> sbcut_internal(struct sockbuf *sb, int len)
> {
> 	struct mbuf *m, *next, *mfree;
> 	bool is_tls;
> 
> 	KASSERT(len >= 0, ("%s: len is %d but it is supposed to be >= 0",
> 	    __func__, len));
> 	...
> }
> ```
> 
> so you can retest with a kernel built with `options INVARIANTS` to verify that, if the overflow occurs.
> 
> > But that's just part of the problem. Why does the buffer grow this large?
> > Our limit is:
> > 
> > kern.ipc.maxsockbuf=157286400
> > 
> > Is it expected to grow so far beyond this limit?
> > 
> > The way we managed to reproduce the issue is to simply spam one host with
> > traffic from another host:
> > 
> > Client:
> > 
> > iperf3 --parallel 8 --time 10 --bidir --client <server-IP>
> > 
> > Server (where the bug occurs):
> > 
> > iperf3 --server
> > 
> > My guess is that the limit is not applied on a per-packet basis but at some
> > other trigger points, and when there is a burst we accumulate so many
> > packets that their total size becomes > 2147483647. The fact that this is a
> > 100GbE card makes it much more likely.
> > 
> >> Hi!
> >> This is the 4th time now that our server has had to be hard-rebooted, the
> >> last two within the span of two hours. It had only been a week since the
> >> server went into production.
> >> 
> >> ...
> >> 
> 
> Best regards,
> Zhenlei
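
P.S. For anyone skimming the archive, here is a simplified, userland-only model of the stall described above. It is not the uipc_sockbuf.c code: `toy_sockbuf` and `toy_sbcut()` are invented for illustration, and the only behaviour carried over from the discussion is that a trim request with len <= 0 removes nothing.

```
#include <stdio.h>

/* Toy stand-in for struct sockbuf: only the byte count matters here. */
struct toy_sockbuf {
	unsigned int ccc;	/* bytes buffered, like the kernel's u_int sb_ccc */
};

/* Toy stand-in for sbcut_internal(): len <= 0 trims nothing. */
static unsigned int
toy_sbcut(struct toy_sockbuf *sb, int len)
{
	unsigned int cut;

	if (len <= 0)
		return (0);
	cut = (unsigned int)len < sb->ccc ? (unsigned int)len : sb->ccc;
	sb->ccc -= cut;
	return (cut);
}

int
main(void)
{
	/* The byte count observed in the core dump: larger than INT_MAX. */
	struct toy_sockbuf sb = { .ccc = 2149804648u };
	unsigned int removed;
	int pass;

	/*
	 * Toy stand-in for the flush loop: the (int) cast turns the count
	 * negative, nothing is ever removed, so the real loop would spin
	 * forever; here we give up after three passes.
	 */
	for (pass = 1; sb.ccc != 0 && pass <= 3; pass++) {
		removed = toy_sbcut(&sb, (int)sb.ccc);
		printf("pass %d: (int)ccc = %d, removed %u, remaining %u\n",
		    pass, (int)sb.ccc, removed, sb.ccc);
	}
	return (0);
}
```

Every pass hands a negative len to the trim routine, nothing is removed, and the byte count never reaches zero, which matches the hang we see at soclose() time.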