Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=5841



A customer running our 2.6.12 kernel with the Lustre patches recently reported
hangs on autofs-managed NFS mounts, and we traced the problem to the patch
provided in this bug.

A user process walks an autofs mount point, which triggers a request to the
automount daemon, and then sleeps in autofs_wait until the daemon answers:

STACK TRACE FOR TASK: 0xe0000021c4030000 (sbatchd)

 0 schedule+0xc06 [0xa000000100543aa6]
 1 interruptible_sleep_on+0xcc [0xa000000100545eec]
 2 autofs_wait+0x2ac [0xa0000001002031ac]
 3 try_to_fill_dentry+0x30c [0xa000000100200e8c]
 4 autofs_revalidate+0x1fc [0xa00000010020123c]
 5 real_lookup+0xfc [0xa00000010014b53c]
 6 do_lookup+0x16c [0xa00000010014c00c]
 7 __link_path_walk+0x3cc [0xa00000010014ca4c]
 8 link_path_walk+0xbc [0xa00000010014f39c]
 9 path_lookup+0x176 [0xa00000010014f6f6]
10 __user_walk_it+0x7c [0xa0000001001500bc]
11 vfs_stat+0x6c [0xa00000010014178c]
12 sys_newstat+0x2c [0xa000000100141d2c]
13 ia64_ret_from_syscall [0xa00000010000b040]

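While servicing that request, the automount daemon tries to cross the same
directory (autofs handles accesses from the automount daemon differently) and
blocks on dir->i_sem. The user process above appears to still hold that
semaphore while it waits for the daemon, so neither task can make progress: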

STACK TRACE FOR TASK: 0xe0000041fe8d0000 (automount)

 0 schedule+0xc06 [0xa000000100543aa6]
 1 __down+0x18c [0xa000000100542dcc]
 2 vfs_readdir+0x12c [0xa00000010015b30c]
 3 sys_getdents64+0xdc [0xa00000010015bbbc]
 4 ia64_ret_from_syscall [0xa00000010000b040]
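
For anyone who wants to reason about the interaction without a kernel at hand,
here is a rough user-space model of the two traces. It is only a sketch, not
kernel code: it assumes the walking process still holds dir->i_sem while it
sleeps in autofs_wait, and the names (simulate, dir_sem, request_posted,
mount_done) are made up for illustration. Timeouts stand in for the hang so
that the script terminates.

```python
import threading


def simulate(walker_holds_lock_while_waiting: bool, timeout: float = 0.2) -> bool:
    """Return True if the mount request completes, False if both sides stall."""
    dir_sem = threading.Lock()          # stands in for dir->i_sem
    request_posted = threading.Event()  # walker has asked the daemon to mount
    mount_done = threading.Event()      # daemon has finished the request

    def daemon() -> None:
        request_posted.wait()
        # Like vfs_readdir in the automount trace, the daemon needs the
        # directory lock before it can finish servicing the request.
        if dir_sem.acquire(timeout=timeout):
            dir_sem.release()
            mount_done.set()

    worker = threading.Thread(target=daemon)
    worker.start()

    # The walker takes the directory lock during lookup.
    dir_sem.acquire()
    if not walker_holds_lock_while_waiting:
        dir_sem.release()  # lock dropped before waiting on the daemon

    request_posted.set()
    # Stands in for autofs_wait: sleep until the daemon reports completion.
    completed = mount_done.wait(timeout=2 * timeout)

    if walker_holds_lock_while_waiting:
        dir_sem.release()
    worker.join()
    return completed


if __name__ == "__main__":
    print("lock held while waiting:", simulate(True))
    print("lock dropped before waiting:", simulate(False))
```

In this model the request only completes when the walker drops the lock before
it waits, which is the property any workaround or fix would need to restore.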

At this point we are looking for a workaround in autofs, but suggestions for a
better fix in Lustre are welcome.

The offending patch is in the vfs_intent Lustre kernel patches for 2.6.12 and
sles10.

_______________________________________________
Lustre-devel mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-devel
