On Wed, Mar 6, 2024 at 10:56 AM Matthew L. Dailey
<matthew.l.dai...@dartmouth.edu> wrote:
>
> Posting a few updates on this issue.
>
> I was able to induce a panic on a CURRENT kernel (20240215), built with
> GENERIC-KASAN and running kern.kstack_pages=6 (default) after ~189
> hours. The panic message and backtrace are below - please reach out
> directly if you'd like to have a look at the core. I'm specifically not
> increasing kstack_pages to see what effect this has on the panics.
>
> I have another system running CURRENT (20240215) without any debugging.
> I'm able to regularly panic this (7 panics over two weeks with average
> uptime of ~42 hours) with kstack_pages=4. I set kstack_pages=6, and have
> also induced several panics. Oddly, this seems to happen more quickly (4
> panics over 2 days with average uptime of ~10.5 hours).
>
> Another system running CURRENT (20240208), built with GENERIC-KASAN and
> running kern.kstack_pages=8 has now been running with our hdf5 workload
> non-stop since February 10th with no panics or issues.
>
>  From all this, it seems like increasing kstack_pages to 8 eliminates
> the panics, but obviously doesn't fix the underlying cause of the
> issues. So, although this is an obvious workaround for our production
> system, it would be better to find and fix the underlying cause of the
> panics.
>
> I'm going to continue testing to try to generate more cores with
> kstack_pages<8 on the system with the debug kernel.
>
> Any other ideas to try to narrow down the cause are welcome.
You might try increasing KSTACK_GUARD_PAGES.
I've never done this, but I think it must be done by editting/rebuilding from
patched sources.
It is in /usr/src/sys/amd64/include/param.h.

Typically, when KSTACK_GUARD_PAGES is 1, when the stack grows
down into the guard page, a double fault occurs.

For your case, it looks like something might be modifying a location that
is a ways below (but within the 8 pages) of where the stack is.
Increasing KSTACK_GUARD_PAGES to something like 4 and then using
the default kstack_pages, it might get the double faults when the wild pointer
first occurs?

Another approach would be to get rid of everything else configured in these
test systems, to see if they are involved in the crashes. (I vaguely remember
you mentioning packet filtering as one example.)

>
> Thanks,
> Matt
>
> [680940] Kernel page fault with the following non-sleepable locks held:
> [680940] exclusive sleep mutex nfs_state_mutex (nfs_state_mutex) r = 0
> (0xffffffff830498e0) locked @ /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:6652
> [680940] stack backtrace:
> [680940] #0 0xffffffff8127958f at witness_debugger+0x13f
> [680940] #1 0xffffffff8127b114 at witness_warn+0x674
> [680940] #2 0xffffffff81aba0a6 at trap_pfault+0x116
> [680940] #3 0xffffffff81ab901c at trap+0x54c
> [680940] #4 0xffffffff81a75988 at calltrap+0x8
> [680940] #5 0xffffffff80fb4bfa at nfsrv_freestateid+0x23a
> [680940] #6 0xffffffff80fd5e3f at nfsrvd_freestateid+0x1df
> [680940] #7 0xffffffff80f98d35 at nfsrvd_dorpc+0x2585
> [680940] #8 0xffffffff80fbf588 at nfssvc_program+0x1078
> [680940] #9 0xffffffff8173fce6 at svc_run_internal+0x1706
> [680940] #10 0xffffffff8174094b at svc_thread_start+0xb
> [680940] #11 0xffffffff811137a3 at fork_exit+0xa3
> [680940] #12 0xffffffff81a769ee at fork_trampoline+0xe
> [680940]
> [680940]
> [680940] Fatal trap 12: page fault while in kernel mode
> [680940] cpuid = 3; apic id = 06
> [680940] fault virtual address  = 0x7
> [680940] fault code             = supervisor read data, page not present
> [680940] instruction pointer    = 0x20:0xffffffff80fafd67
> [680940] stack pointer          = 0x28:0xfffffe0153ba2de0
> [680940] frame pointer          = 0x28:0xfffffe0153ba2eb0
> [680940] code segment           = base 0x0, limit 0xfffff, type 0x1b
> [680940]                        = DPL 0, pres 1, long 1, def32 0, gran 1
> [680940] processor eflags       = interrupt enabled, resume, IOPL = 0
> [680940] current process                = 55202 (nfsd: service)
> [680940] rdi: 0000000000000007 rsi: 0000000000000000 rdx: dffff7c000000000
> [680940] rcx: fffffe001b9ec1e8  r8: 0012c43500000002  r9: 0012c43500000002
> [680940] rax: fffffe001b9ec1e8 rbx: ffffffffffffffff rbp: fffffe0153ba2eb0
> [680940] r10: 0000000000000004 r11: 0000000000000006 r12: 0000000000000007
> [680940] r13: fffffe019cd75700 r14: 0000000000001a1a r15: fffffe019cd75708
> [680940] trap number            = 12
> [680940] panic: page fault
> [680940] cpuid = 3
> [680940] time = 1709646178
> [680940] KDB: stack backtrace:
> [680940] db_trace_self_wrapper() at db_trace_self_wrapper+0xa5/frame
> 0xfffffe0153ba2550
> [680940] kdb_backtrace() at kdb_backtrace+0xc6/frame 0xfffffe0153ba26b0
> [680940] vpanic() at vpanic+0x210/frame 0xfffffe0153ba2850
> [680940] panic() at panic+0xb5/frame 0xfffffe0153ba2910
> [680940] trap_fatal() at trap_fatal+0x65b/frame 0xfffffe0153ba2a10
> [680940] trap_pfault() at trap_pfault+0x12b/frame 0xfffffe0153ba2b30
> [680940] trap() at trap+0x54c/frame 0xfffffe0153ba2d10
> [680940] calltrap() at calltrap+0x8/frame 0xfffffe0153ba2d10
> [680940] --- trap 0xc, rip = 0xffffffff80fafd67, rsp =
> 0xfffffe0153ba2de0, rbp = 0xfffffe0153ba2eb0 ---
> [680940] nfsrv_freelockowner() at nfsrv_freelockowner+0x97/frame
This seems to indicate that a list pointer (the one called ls_list in the
nfsstateid structure is bogus). However the other list in the same structure
(called ls_hash) seems ok, since it would not have gotten here otherwise.
Both these lists are set to-gether with the structure is added, so it doesn't
make sense that one would be bogus, unless some wild pointer overwrote
that location, I think?

rick

> 0xfffffe0153ba2eb0
> [680940] nfsrv_freestateid() at nfsrv_freestateid+0x23a/frame
> 0xfffffe0153ba2f70
> [680940] nfsrvd_freestateid() at nfsrvd_freestateid+0x1df/frame
> 0xfffffe0153ba3030
> [680940] nfsrvd_dorpc() at nfsrvd_dorpc+0x2585/frame 0xfffffe0153ba3570
> [680940] nfssvc_program() at nfssvc_program+0x1078/frame 0xfffffe0153ba3970
> [680940] svc_run_internal() at svc_run_internal+0x1706/frame
> 0xfffffe0153ba3ee0
> [680940] svc_thread_start() at svc_thread_start+0xb/frame 0xfffffe0153ba3ef0
> [680940] fork_exit() at fork_exit+0xa3/frame 0xfffffe0153ba3f30
> [680940] fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0153ba3f30
> [680940] --- trap 0xc, rip = 0x3b4ff896f0da, rsp = 0x3b4ff6a500e8, rbp =
> 0x3b4ff6a50380 ---
> [680940] KDB: enter: panic
>
> #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
> #1  doadump (textdump=textdump@entry=0)
>      at /usr/src/sys/kern/kern_shutdown.c:403
> #2  0xffffffff805a52f6 in db_dump (dummy=<optimized out>,
>      dummy2=<optimized out>, dummy3=<optimized out>, dummy4=<optimized out>)
>      at /usr/src/sys/ddb/db_command.c:590
> #3  0xffffffff805a512b in db_command (last_cmdp=<optimized out>,
>      cmd_table=<optimized out>, dopager=true)
>      at /usr/src/sys/ddb/db_command.c:503
> #4  0xffffffff805a4b3d in db_command_loop ()
>      at /usr/src/sys/ddb/db_command.c:550
> #5  0xffffffff805aa209 in db_trap (type=<optimized out>, code=<optimized
> out>)
>      at /usr/src/sys/ddb/db_main.c:267
> #6  0xffffffff81239f25 in kdb_trap (type=3, code=code@entry=0,
>      tf=tf@entry=0xfffffe0153ba25f0) at /usr/src/sys/kern/subr_kdb.c:790
> #7  0xffffffff81ab8f88 in trap (frame=0xfffffe0153ba25f0)
>      at /usr/src/sys/amd64/amd64/trap.c:606
> #8  <signal handler called>
> #9  kdb_enter (why=<optimized out>, msg=<optimized out>)
>      at /usr/src/sys/kern/subr_kdb.c:556
> #10 0xffffffff8118fdf1 in vpanic (fmt=fmt@entry=0xffffffff82491d20 <str>
> "%s",
>      ap=ap@entry=0xfffffe0153ba28c0) at
> /usr/src/sys/kern/kern_shutdown.c:961
> #11 0xffffffff8118fba5 in panic (fmt=0xffffffff82491d20 <str> "%s")
>      at /usr/src/sys/kern/kern_shutdown.c:889
> #12 0xffffffff81ab9f8b in trap_fatal (frame=0xfffffe0153ba2d20, eva=7)
>      at /usr/src/sys/amd64/amd64/trap.c:950
> #13 0xffffffff81aba0bb in trap_pfault (frame=frame@entry=0xfffffe0153ba2d20,
>      usermode=false, signo=signo@entry=0x0, ucode=ucode@entry=0x0)
>      at /usr/src/sys/amd64/amd64/trap.c:761
> #14 0xffffffff81ab901c in trap (frame=0xfffffe0153ba2d20)
>      at /usr/src/sys/amd64/amd64/trap.c:440
> #15 <signal handler called>
> #16 0xffffffff80fafd67 in nfsrv_freelockowner (
>      stp=stp@entry=0xfffffe019cd75700, vp=vp@entry=0x0,
>      cansleep=cansleep@entry=0, p=p@entry=0xfffffe0153e2c740)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:1550
> #17 0xffffffff80fb4bfa in nfsrv_freestateid (nd=nd@entry=0xfffffe0153ba3710,
>      stateidp=stateidp@entry=0xfffffe0153ba2fc0,
> p=p@entry=0xfffffe0153e2c740)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdstate.c:6681
> #18 0xffffffff80fd5e3f in nfsrvd_freestateid (nd=0xfffffe0153ba3710,
>      isdgram=<optimized out>, vp=<optimized out>, exp=<optimized out>)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdserv.c:4764
> #19 0xffffffff80f98d35 in nfsrvd_compound (nd=0xfffffe0153ba3710, isdgram=0,
>      tag=<optimized out>, taglen=0, minorvers=<optimized out>)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdsocket.c:1338
> #20 nfsrvd_dorpc (nd=nd@entry=0xfffffe0153ba3710, isdgram=isdgram@entry=0,
>      tag=tag@entry=0xfffffe0153ba3670 "", taglen=taglen@entry=0,
>      minorvers=<optimized out>)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdsocket.c:633
> #21 0xffffffff80fbf588 in nfs_proc (xid=<optimized out>,
>      xprt=0xfffffe0199f0c600, nd=<optimized out>, rpp=<optimized out>)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:464
> #22 nfssvc_program (rqst=0xfffffe00f3c07000, xprt=0xfffffe0199f0c600)
>      at /usr/src/sys/fs/nfsserver/nfs_nfsdkrpc.c:348
> #23 0xffffffff8173fce6 in svc_executereq (rqstp=0xfffffe00f3c07000)
>      at /usr/src/sys/rpc/svc.c:1032
> #24 svc_run_internal (grp=grp@entry=0xfffffe00f04ec100,
>      ismaster=ismaster@entry=0) at /usr/src/sys/rpc/svc.c:1308
> #25 0xffffffff8174094b in svc_thread_start (arg=0x7,
>      arg@entry=0xfffffe00f04ec100) at /usr/src/sys/rpc/svc.c:1336
> #26 0xffffffff811137a3 in fork_exit (
>      callout=0xffffffff81740940 <svc_thread_start>, arg=0xfffffe00f04ec100,
>      frame=0xfffffe0153ba3f40) at /usr/src/sys/kern/kern_fork.c:1157
> #27 <signal handler called>
> #28 0x00003b4ff896f0da in ?? ()

Reply via email to