On Fri, Aug 07, 2009 at 04:48:37PM +0400, pluknet wrote:
> 2009/8/7 Kostik Belousov <[email protected]>:
> > On Fri, Aug 07, 2009 at 04:37:07PM +0400, pluknet wrote:
> >> 2009/8/7 Kostik Belousov <[email protected]>:
> >> > On Fri, Aug 07, 2009 at 03:37:11PM +0400, pluknet wrote:
> >> >> This is on 7.2-R amd64.
> >> >>
> >> >> I'm curious if it might be due to glusterfs on it.
> >> >>
> >> >> Fatal trap 12: page fault while in kernel mode
> >> >> cpuid = 3; apic id = 03
> >> >> fault virtual address = 0x0
> >> >> fault code = supervisor write data, page not present
> >> >> instruction pointer = 0x8:0xffffffff805a52ba
> >> >> stack pointer = 0x10:0xfffffffefc3474a0
> >> >> frame pointer = 0x10:0xfffffffefc347510
> >> >> code segment = base 0x0, limit 0xfffff, type
> >> >> = DPL 0, pres 1, long 1, def32 0, gran 1
> >> >> processor eflags = interrupt enabled, resume, IOPL = 0
> >> >> current process = 35425 (find)
> >> >>
> >> >> db> bt
> >> >> Tracing pid 35425 tid 100194 td 0xffffff003c165370
> >> >> vgonel() at vgonel+0x1aa
> >> >> vnlru_free() at vnlru_free+0x36c
> >> >> getnewvnode() at getnewvnode+0x281
> >> >> ffs_vgetf() at ffs_vgetf+0xdf
> >> >> ufs_lookup() at ufs_lookup+0x2dd
> >> >> vfs_cache_lookup() at vfs_cache_lookup+0xf3
> >> >> VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x40
> >> >> lookup() at lookup+0x598
> >> >> namei() at namei+0x33e
> >> >> kern_lstat() at kern_lstat+0x5e
> >> >> lstat() at lstat+0x2a
> >> >> syscall() at syscall+0x256
> >> >> Xfast_syscall() at Xfast_syscall+0xab
> >> >> --- syscall (190, FreeBSD ELF64, lstat), rip = 0x80071063c, rsp =
> >> >> 0x7fffffffea48, rbp = 0x800a06910 ---
> >> >
> >> > Did you got the vmcore ? If yes, please find the value for vgonel()
> >> > argument, vp, and print the vnode content.
> >>
> >> I didn't. Same problem as in my another mail. :(
> >>
> >> >
> >> > Regardless of this, look up the source line for vgonel+0x1aa.
> >> >
> >>
> >> I could resolve only address which belongs to instruction pointer
> >> = 0x8:0xffffffff805a52ba
> >> (eh, I don't know if I should sum these numbers, so I did this for both
> >> cases):
> >>
> >> dev2# addr2line -e /boot/kernel/kernel.symbols 0xffffffff805a52ba
> >> /usr/src/sys/kern/vfs_subr.c:979
> >> delmntque(): TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
> >>
> >> dev2# addr2line -e /boot/kernel/kernel.symbols 0xffffffff805a52c2
> >> /usr/src/sys/kern/vfs_subr.c:981
> >> delmntque(): MNT_REL(mp);
> >
> > load kernel.debug into gdb, and then do "list *(vgonel+0x1aa)"
> >
>
> Ah, of course. Sorry.
>
> (gdb) list *(vgonel+0x1aa)
> 0xffffffff805a52ba is in vgonel (/usr/src/sys/kern/vfs_subr.c:979).
> 974             return;
> 975     MNT_ILOCK(mp);
> 976     vp->v_mount = NULL;
> 977     VNASSERT(mp->mnt_nvnodelistsize > 0, vp,
> 978         ("bad mount point vnode list size"));
> 979     TAILQ_REMOVE(&mp->mnt_nvnodelist, vp, v_nmntvnodes);
> 980     mp->mnt_nvnodelistsize--;
> 981     MNT_REL(mp);
> 982     MNT_IUNLOCK(mp);
> 983 }

If you can reproduce the issue at will, please try to reproduce it on a kernel built with INVARIANTS and, possibly, DEBUG_VFS_LOCKS enabled.
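
For reference, a minimal sketch of the kernel configuration lines this would involve. It assumes a custom config file that includes GENERIC; the ident MYDEBUG is a hypothetical name, and INVARIANT_SUPPORT is listed because INVARIANTS depends on it:

```
include GENERIC
ident   MYDEBUG

# Run-time consistency checks (KASSERT, VNASSERT and friends).
options INVARIANTS
options INVARIANT_SUPPORT

# Extra assertions on vnode locking.
options DEBUG_VFS_LOCKS
```

With INVARIANTS on, the VNASSERT at vfs_subr.c:977 in the listing above is actually evaluated, so an inconsistent mount vnode list may be reported before the faulting TAILQ_REMOVE is reached.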
