https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229167

--- Comment #37 from Wei Hu <w...@microsoft.com> ---
I hit this bug on a FreeBSD 12.1 image in Azure with a Mellanox CX3 VF. It looks
like the root file system is not yet available when the mlx4 driver tries to
load the mlx4en kernel module. Adding a one-second sleep in mlx4_request_modules
makes the problem go away, but that doesn't look like a fix to me. At the very
least, namei() or vrefact() should check whether the vnode is NULL to avoid the
panic.
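
To make the suggested check concrete, here is a minimal user-space sketch. It
is not the kernel code: the structures are simplified stand-ins, and
lookup_start(), mock_vrefact() and the choice of ENOENT as the error are all
assumptions made for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct vnode {
    int v_type;
    unsigned v_holdcnt;
    unsigned v_usecount;
};

struct filedesc {
    struct vnode *fd_rdir;  /* root directory; NULL before root is mounted */
    struct vnode *fd_jdir;  /* jail root directory */
};

/* Mock of vrefact(): takes another reference on an already-active vnode. */
static void mock_vrefact(struct vnode *vp)
{
    vp->v_holdcnt++;
    vp->v_usecount++;
}

/*
 * Hypothetical guarded version of the "starting point" step in namei().
 * Returns 0 on success, or ENOENT (an assumed choice) when no root
 * directory has been set yet, instead of dereferencing a NULL vnode.
 */
static int lookup_start(struct filedesc *fdp, struct vnode **rootdir)
{
    if (fdp->fd_rdir == NULL)
        return ENOENT;
    *rootdir = fdp->fd_rdir;
    mock_vrefact(*rootdir);
    return 0;
}
```

With fd_rdir still NULL, lookup_start() returns an error to the caller rather
than faulting; once a root vnode is present it takes the reference as before.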

Here is the detailed troubleshooting I did in the debugger when the crash
happened.

The panic on console:
----------------------
pci1: <PCI bus> on pcib1
mlx4_core0: <mlx4_core> at device 2.0 on pci1
<6>mlx4_core: Mellanox ConnectX core driver v3.5.1 (April 2019)
mlx4_core: Initializing mlx4_core
mlx4_core0: Detected virtual function - running in slave mode
mlx4_core0: Sending reset
mlx4_core0: Sending vhcr0
mlx4_core0: HCA minimum page size:512
mlx4_core0: Timestamping is not supported in slave mode
mlx4_en mlx4_core0: Activating port:1
mlxen0: Ethernet address: 00:0d:3a:e8:16:18
<4>mlx4_en: mlx4_core0: Port 1: Using 4 TX rings
mlxen0: link state changed to DOWN
<4>mlx4_en: mlx4_core0: Port 1: Using 4 RX rings
<4>mlx4_en: mlxen0: Using 4 TX rings
hn0: link state changed to DOWN
<4>mlx4_en: mlxen0: Using 4 RX rings
<4>mlx4_en: mlxen0: Initializing port
mlx4_core0: About to load mlx4_en


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address   = 0x1d8   <- offset of v_type in struct vnode
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80cb5c34
stack pointer           = 0x28:0xfffffe00004f4960
frame pointer           = 0x28:0xfffffe00004f4960
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (vmbusdev)
trap number             = 12
panic: page fault
cpuid = 2
time = 1599838711
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00004f4610
vpanic() at vpanic+0x19d/frame 0xfffffe00004f4660
panic() at panic+0x43/frame 0xfffffe00004f46c0
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe00004f4720
trap_pfault() at trap_pfault+0x49/frame 0xfffffe00004f4780
trap() at trap+0x29f/frame 0xfffffe00004f4890
calltrap() at calltrap+0x8/frame 0xfffffe00004f4890
--- trap 0xc, rip = 0xffffffff80cb5c34, rsp = 0xfffffe00004f4960, rbp = 0xfffffe00004f4960 ---
vrefact() at vrefact+0x4/frame 0xfffffe00004f4960
namei() at namei+0x172/frame 0xfffffe00004f4a20
vn_open_cred() at vn_open_cred+0x221/frame 0xfffffe00004f4b70
linker_load_module() at linker_load_module+0x480/frame 0xfffffe00004f4e90
kern_kldload() at kern_kldload+0xc3/frame 0xfffffe00004f4ee0
mlx4_request_modules() at mlx4_request_modules+0xc2/frame 0xfffffe00004f4fa0
mlx4_load_one() at mlx4_load_one+0x349c/frame 0xfffffe00004f5660
mlx4_init_one() at mlx4_init_one+0x3f0/frame 0xfffffe00004f56b0
linux_pci_attach() at linux_pci_attach+0x432/frame 0xfffffe00004f5710
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5760
bus_generic_attach() at bus_generic_attach+0x5c/frame 0xfffffe00004f5790
pci_attach() at pci_attach+0xd5/frame 0xfffffe00004f57d0
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5820
bus_generic_attach() at bus_generic_attach+0x5c/frame 0xfffffe00004f5850
vmbus_pcib_attach() at vmbus_pcib_attach+0x75e/frame 0xfffffe00004f5930
device_attach() at device_attach+0x3e1/frame 0xfffffe00004f5980
device_probe_and_attach() at device_probe_and_attach+0x42/frame 0xfffffe00004f59b0
vmbus_add_child() at vmbus_add_child+0x7b/frame 0xfffffe00004f59e0
taskqueue_run_locked() at taskqueue_run_locked+0x154/frame 0xfffffe00004f5a40
taskqueue_thread_loop() at taskqueue_thread_loop+0x98/frame 0xfffffe00004f5a70
fork_exit() at fork_exit+0x83/frame 0xfffffe00004f5ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004f5ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 0 tid 100080 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why

db> x/i vrefact,10
vrefact:        pushq   %rbp
vrefact+0x1:    movq    %rsp,%rbp
vrefact+0x4:    cmpl    $0x4,ll+0x1b7(%rdi)  <-- if (__predict_false(vp->v_type == VCHR)) in vrefact()
vrefact+0xb:    jz      vrefact+0x1f
vrefact+0xd:    lock addl       $0x1,ll+0x19b(%rdi)
vrefact+0x15:   lock addl       $0x1,ll+0x19f(%rdi)
vrefact+0x1d:   popq    %rbp
vrefact+0x1e:   ret

db> x/i namei+0x160,10
namei+0x160:    call    _sx_slock_int
namei+0x165:    movq    0x10(%r13),%rdi
namei+0x169:    movq    %rdi,ll+0x7(%rbx)
namei+0x16d:    call    vrefact          <--- call to vrefact() in namei()
namei+0x172:    movq    0x18(%r13),%rax
namei+0x176:    movq    %rax,ll+0xf(%rbx)
namei+0x17a:    movq    ll+0x77(%rbx),%rax

This is the code in namei():

        /*
         * Get starting point for the translation.
         */
        FILEDESC_SLOCK(fdp);
        ndp->ni_rootdir = fdp->fd_rdir;
        vrefact(ndp->ni_rootdir);         <--- here
        ndp->ni_topdir = fdp->fd_jdir;

Here fdp is a (struct filedesc *) that was assigned earlier from the current
process's p_fd:

        p = td->td_proc;
        ...
        fdp = p->p_fd;
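
The pointer chain that the debugger session walks by hand can be sketched as
follows. This is a user-space illustration only: the structures are reduced to
the one field each that matters here, and thread_rootdir() is a hypothetical
helper, not a kernel function.

```c
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures have many more fields. */
struct vnode;
struct filedesc { struct vnode *fd_rdir; };
struct proc { struct filedesc *p_fd; };
struct thread { struct proc *td_proc; };

/* Follows td -> td_proc -> p_fd -> fd_rdir, the chain namei() relies on. */
static struct vnode *thread_rootdir(const struct thread *td)
{
    const struct proc *p = td->td_proc;
    const struct filedesc *fdp = p->p_fd;

    return fdp->fd_rdir;
}
```

If the process's filedesc has no root directory yet, this returns NULL, which
is the value namei() then hands to vrefact().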


db> show thread
Thread 100080 at 0xfffff80004a95000:
 proc (pid 0): 0xffffffff81ff2060   <--- pointer to proc
 name: vmbusdev
 stack: 0xfffffe00004f2000-0xfffffe00004f5fff
 flags: 0x4  pflags: 0x200000
 state: RUNNING (CPU 2)
 priority: 8
 container lock: sched lock 2 (0xffffffff81eb3540)
 last voluntary switch: 50 ms ago
db> x/gx 0xffffffff81ff2060,20 (struct proc)
proc0:  0                               fffff80003609a60               
ffffffff81ff25a0
proc0+0x18:     fffff80009866010                ffffffff81332c23
proc0+0x28:     30000                           0
proc0+0x38:     0                               fffff8000308ad00
proc0+0x48:     fffff800035168a0 (p_fd)               0
proc0+0x58:     fffff80003084e00                fffff8000308ac00

db> x/gx 0xfffff800035168a0, 10 (struct filedesc)
0xfffff800035168a0:     fffff80003516920                0
0xfffff800035168b0:     0  (fd_rdir)                    0
0xfffff800035168c0:     fffff80003516ce8                ffffffff
0xfffff800035168d0:     100000012                       1
0xfffff800035168e0:     ffffffff812c4e35                2330000
0xfffff800035168f0:     0                               21
0xfffff80003516900:     0                               0
0xfffff80003516910:     0                               0

So at this point fd_rdir (the root directory vnode) is still NULL, which is why
vrefact() faults at address 0x1d8 when it reads v_type.
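
The arithmetic behind the fault address can be sketched in user space. The
layout below is a mock: the padding is chosen only so that v_type lands at
0x1d8, as seen in the trap, and mock_vnode and v_type_address() are
hypothetical names, not the real struct vnode.

```c
#include <stddef.h>
#include <stdint.h>

/* Mock layout only: padding chosen so v_type sits at 0x1d8 as in the trap. */
struct mock_vnode {
    unsigned char pad[0x1d8];
    int v_type;
};

_Static_assert(offsetof(struct mock_vnode, v_type) == 0x1d8,
    "mock v_type offset must match the fault address");

/*
 * Computes the address the CPU would touch when reading vp->v_type,
 * without dereferencing anything.
 */
static uintptr_t v_type_address(uintptr_t vp)
{
    return vp + offsetof(struct mock_vnode, v_type);
}
```

For a NULL vnode pointer, v_type_address(0) yields 0x1d8, which matches the
fault virtual address reported on the console.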
