A system I'm looking at continually runs into deadlocks when doing a
'make -j20 world'.  This appears to happen only when the file system
is mounted on a vinum volume, and it possibly also requires
softupdates, though I haven't confirmed this yet.  I initially thought
that this was a vinum problem, and though I can't exclude the
possibility, it's beginning to look less like it (since vinum doesn't
perform this kind of locking).  I currently suspect that vinum acts as
a catalyst by issuing multiple disk I/O requests in a very short space
of time.

Typically, the stack of a hanging process looks like this:

  pid    proc    addr   uid  ppid  pgrp   flag wchan stat comm         wchan
65139 f3f6bd00 f3f6c000    0     0   556  004006 3  ld           inode f0bb3e00
Check your .gdbinit, it contains a y command
 frame 0 at 0xf3f6dc10: ebp f3f6dc2c, eip 0xf0134141 <tsleep+333>:           movb   0xfd(%ebx),%al
 frame 1 at 0xf3f6dc2c: ebp f3f6dc50, eip 0xf012dd23 <acquire+139>:          movl   %eax,%esi
 frame 2 at 0xf3f6dc50: ebp f3f6dc74, eip 0xf012dfe5 <debuglockmgr+637>:     movl   %eax,0xfffffffc(%ebp)
 frame 3 at 0xf3f6dc74: ebp f3f6dc98, eip 0xf015457d <vop_stdlock+49>:       leave
 frame 4 at 0xf3f6dc98: ebp f3f6dca4, eip 0xf01c8b35 <ufs_vnoperate+21>:     leave
 frame 5 at 0xf3f6dca4: ebp f3f6dccc, eip 0xf015d7dd <debug_vn_lock+85>:     addl   $0x4,%esp
 frame 6 at 0xf3f6dccc: ebp f3f6dcf0, eip 0xf01571a9 <vget+113>:             movl   %eax,%esi
 frame 7 at 0xf3f6dcf0: ebp f3f6dd3c, eip 0xf015300b <vfs_cache_lookup+375>: movl   %eax,%ebx
 frame 8 at 0xf3f6dd3c: ebp f3f6dd48, eip 0xf01c8b35 <ufs_vnoperate+21>:     leave
 frame 9 at 0xf3f6dd48: ebp f3f6dd94, eip 0xf01557ad <lookup+729>:           movl   %eax,0xffffffcc(%ebp)
 frame 10 at 0xf3f6dd94: ebp f3f6ddf0, eip 0xf0155182 <namei+458>:           movl   %eax,0xffffffc0(%ebp)
 frame 11 at 0xf3f6ddf0: ebp f3f6df54, eip 0xf012bef1 <execve+257>:          movl   %eax,0xfffffebc(%ebp)
 frame 12 at 0xf3f6df54: ebp f3f6dfa4, eip 0xf01f07ab <syscall+295>:         addl   $0x8,%esp

The lock manipulated by (debug)lockmgr shows:

$1 = {
  lk_interlock = {
    lock_data = 0
  }, 
  lk_flags = 3146240, 
  lk_sharecount = 1, 
  lk_waitcount = 2, 
  lk_exclusivecount = 0, 
  lk_prio = 8, 
  lk_wmesg = 0xf0216943 "inode", 
  lk_timo = 0, 
  lk_lockholder = -1, 
  lk_filename = 0xf020fdfb "../../kern/vfs_subr.c", 
  lk_lockername = 0xf020fa73 "vop_stdlock", 
  lk_lineno = 1268
}

The locker call was from vget().
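
For anyone reading the lk_flags value in the dump above: gdb prints it
in decimal, which hides the individual bits.  The small sketch below
converts it to hex and lists the bits that are set.  The flag names in
the table are an assumption on my part, recalled from the sys/lock.h of
this era rather than checked against this exact kernel, so please
verify them against the header in your own tree.  The helper names
(set_bits, decode, ASSUMED_FLAGS) are made up for illustration.

```python
def set_bits(value):
    """Return the positions of the bits set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]

# ASSUMED flag names and values (not verified against this kernel's
# sys/lock.h); unknown bits are reported as a plain hex mask instead.
ASSUMED_FLAGS = {
    0x00000200: "LK_WANT_EXCL",
    0x00100000: "LK_SHARE_NONZERO",
    0x00200000: "LK_WAIT_NONZERO",
}

def decode(value):
    """Map each set bit to its assumed name, or to its hex mask."""
    names = []
    for bit in set_bits(value):
        mask = 1 << bit
        names.append(ASSUMED_FLAGS.get(mask, hex(mask)))
    return names

lk_flags = 3146240          # value from the lock dump
print(hex(lk_flags))        # hex form of the same value
print(set_bits(lk_flags))   # bit positions that are set
print(decode(lk_flags))     # assumed names for those bits
```

If the assumed names are right, this would agree with the counters in
the dump (lk_sharecount = 1, lk_waitcount = 2): the lock is held
shared, and there are waiters, at least one of which wants it
exclusively.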

The associated vnode belongs to a file system mounted on a vinum
volume, which in this build held the /usr/obj hierarchy.  In the
example I'm showing here, the file was
/usr/obj/usr/src/tmp/usr/libexec/elf/mld.

I have lots more information available, but I don't want to put it all
in here.  If you think you recognize the problem, or if you have a
suggestion about how to approach it, I'd be pleased to hear from you,
either in this forum or privately.

Greg
-- 
See complete headers for address, home page and phone numbers
finger g...@lemis.com for PGP public key

