On 19 Dec 2025, at 19:42, Gleb Smirnoff <[email protected]> wrote:
>
> On Fri, Dec 19, 2025 at 12:10:09PM +0100, Kristof Provost wrote:
> K> I’m seeing panics on pfsync interface destruction now:
> K>
> K> panic: mld_change_state: bad ifp
> K> cpuid = 19
> K> time = 1766142554
> K> KDB: stack backtrace:
> K> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> K> 0xfffffe01843fd990
> K> vpanic() at vpanic+0x136/frame 0xfffffe01843fdac0
> K> panic() at panic+0x43/frame 0xfffffe01843fdb20
> K> mld_change_state() at mld_change_state+0x6d0/frame 0xfffffe01843fdb90
> K> in6_leavegroup_locked() at in6_leavegroup_locked+0xa9/frame
> K> 0xfffffe01843fdbf0
> K> in6_leavegroup() at in6_leavegroup+0x32/frame 0xfffffe01843fdc10
> K> pfsync_multicast_cleanup() at pfsync_multicast_cleanup+0x83/frame
> K> 0xfffffe01843fdc40
> K> pfsync_clone_destroy() at pfsync_clone_destroy+0x260/frame
> K> 0xfffffe01843fdc90
> K> ifc_simple_destroy_wrapper() at ifc_simple_destroy_wrapper+0x26/frame
> K> 0xfffffe01843fdca0
> K> if_clone_destroyif_flags() at if_clone_destroyif_flags+0x69/frame
> K> 0xfffffe01843fdce0
> K> if_clone_detach() at if_clone_detach+0xe6/frame 0xfffffe01843fdd10
> K> vnet_pfsync_uninit() at vnet_pfsync_uninit+0xf0/frame 0xfffffe01843fdd30
> K> vnet_destroy() at vnet_destroy+0x154/frame 0xfffffe01843fdd60
> K> prison_deref() at prison_deref+0xaf5/frame 0xfffffe01843fddd0
> K> sys_jail_remove() at sys_jail_remove+0x15c/frame 0xfffffe01843fde00
> K> amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe01843fdf30
> K> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe01843fdf30
> K> --- syscall (508, FreeBSD ELF64, jail_remove), rip = 0x2d8234c9e31a, rsp =
> K> 0x2d823179b928, rbp = 0x2d823179b9b0 ---
> K> KDB: enter: panic
> K>
> K> The pfsync:basic_ipv6 test seems to trigger this reliably.
>
> This actually surfaced an interesting problem, and pfsync being an interface
> isn't the culprit here :) Neither are my changes.
>
> The problem is that the IPv6 multicast layer, in in6_getmulti(), calls into
> the interface multicast layer via if_addmulti() to allocate a struct
> ifmultiaddr. This newborn ifmultiaddr has a refcount of 1, but it is
> referenced both by the struct in6_multi and by the interface's linked list,
> so it should have a refcount of 2. In all normal cases the in6_multi
> structures are also associated with the interface they were allocated for,
> and at teardown they all go away together, so this refcounting bug never
> triggers.
>
> But with pfsync calling in6_joingroup() on some ifnet from pfsync's own
> context, we end up in a situation where the struct in6_multi is external to
> the ifnet it is associated with. If that ifnet is detached before the
> pfsync context is destroyed, our in6_multi will point at a detached ifnet
> hanging on its last reference (all methods point to if_dead), and it will
> also point at a freed ifmultiaddr.
>
> I'm looking at either a proper fix or at hiding it back under the carpet,
> as it was before.

What I'm seeing with main-n282652-4100bd6caa66 is this:
panic: bpf_ifnet_write: ifp 0xfffff8002d492800 type 209 not supported
...
(kgdb) #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
td = <optimized out>
#1 doadump (textdump=textdump@entry=0)
at /usr/src/sys/kern/kern_shutdown.c:399
error = 0
coredump = <optimized out>
...
#10 0xffffffff80b9776b in vpanic (
fmt=0xffffffff811e552a "%s: ifp %p type %u not supported",
ap=ap@entry=0xfffffe00d75f9c00) at /usr/src/sys/kern/kern_shutdown.c:962
buf = "bpf_ifnet_write: ifp 0xfffff8002d492800 type 209 not supported",
'\000' <repeats 193 times>
__pc = 0x0
__pc = 0x0
__pc = 0x0
other_cpus = {__bits = {65534, 0 <repeats 15 times>}}
td = 0xfffff8000d8c7780
bootopt = <optimized out>
newpanic = <optimized out>
#11 0xffffffff80b975d3 in panic (
fmt=0xffffffff81d9fa50 <cnputs_mtx> "\256`\035\201\377\377\377\377")
at /usr/src/sys/kern/kern_shutdown.c:887
ap = {{gp_offset = 32, fp_offset = 48,
overflow_arg_area = 0xfffffe00d75f9c30,
reg_save_area = 0xfffffe00d75f9bd0}}
#12 0xffffffff80cd600f in bpf_ifnet_write (arg=0xfffff8002d492800,
m=0xfffff80033020900, mc=0x0, flags=32) at /usr/src/sys/net/bpf_ifnet.c:141
ro = {ro_nh = 0x0, ro_lle = 0x0, ro_prepend = 0x0, ro_plen = 0,
ro_flags = 0, ro_mtu = 0, spare = 0, ro_dst = {sa_len = 0 '\000',
sa_family = 0 '\000', sa_data = '\000' <repeats 13 times>}}
dst = {sa_len = 0 '\000', sa_family = 0 '\000',
sa_data = '\000' <repeats 13 times>}
hlen = 0
saved_vnet = <optimized out>
error = <optimized out>
ifp = <optimized out>
#13 0xffffffff80cd2030 in bpfwrite (dev=<optimized out>, uio=<optimized out>,
ioflag=<optimized out>) at /usr/src/sys/net/bpf.c:1052
et = {et_link = {tqe_next = 0x0, tqe_prev = 0xfffffe00167c2ad8},
et_td = 0xfffff8000d8c7780, et_section = {bucket = 1},
et_old_priority = 144 '\220'}
d = 0xfffff8000eec8a00
error = <optimized out>
bp = 0xfffff80001aa3d00
m = 0xfffff80033020900
mc = 0x0
len = <optimized out>
#14 0xffffffff80a114a3 in devfs_write_f (fp=0xfffff8000d884140,
uio=0xfffff80001aa3c80, cred=<optimized out>, flags=0,
td=0xfffff8000d8c7780) at /usr/src/sys/fs/devfs/devfs_vnops.c:1960
dev = 0xfffff800017fc800
ref = 1
fpop = 0x0
dsw = 0xffffffff81af2b68 <bpf_cdevsw>
error = 0
ioflag = 0
resid = 342
-Dimitry