On 23 Nov 2014, at 17:47, Emmanuel Dreyfus <[email protected]> wrote:

> Hi
>
> I ran into this strange bug with glusterfs NFS server, which is possible
> because it allows the mounted filesystem root vnode to be VLNK (NetBSD's
> native mountd prevents this situation and therefore the bug does not
> happen with our native NFS server):
>
> bacasel# mount
> bacasel.net.espci.fr:/patchy/symlink1 on /mnt/nfs/0 type nfs
>
> bacasel# ls -l /mnt/nfs
> lrwxrwxrwx 1 root wheel 4 Nov 23 10:03 0 -> dir1
>
> That is possible because on the exported filesystem, symlink1 is a
> symlink to dir1.
>
> It looks funny and harmless, at least until one tries to unmount while
> the NFS server is down. I do this using the most reliable way:
> umount -f -R /mnt/nfs/0
>
> umount(8) will quickly call unmount(2), passing the /mnt/nfs/0 path
> without trying anything fancy. But at the beginning of sys_unmount() in
> the kernel, we can find:
>
>     NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF | TRYEMULROOT, pb);
>     if ((error = namei(&nd)) != 0) {
>         pathbuf_destroy(pb);
>         return error;
>     }
>
> The FOLLOW flag will cause the symlink to be resolved before
> sys_unmount() can proceed. Since the NFS server is down, the namei call
> will never return, and umount(8) gets stuck in the kernel with this
> backtrace:
>
>     sleepq_block
>     kpause
>     nfs_reconnect
>     nfs_request
>     nfs_readlinkrpc
>     nfs_readlink
>     VOP_READLINK
>     namei_tryemulroot
>     namei
>     sys_unmount
>     syscall
>
> This is annoying because umount -f should really unmount. Moreover,
> stuck processes will crop up on the system because umount(8) holds a
> vnode lock forever.
>
> I see two ways of fixing this:
> 1) Remove the FOLLOW flag in sys_unmount(). After all, unless the -R
> flag is given, umount(8) should have resolved the path before calling
> unmount(2).

Did you try -- last time I tried a forced unmount with the NFS server down
it didn't work even with the root being a directory, because the namei()
call would hang in VOP_LOOKUP(). Does it work these days?

> 2) Define a new MNT_RAW mount flag. If umount(8) is called with -R,
> pass that flag to unmount(2), and in sys_unmount(), do not use FOLLOW if
> MNT_RAW is set.

-- 
J. Hannken-Illjes - [email protected] - TU Braunschweig (Germany)
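
For reference, option 2 might look roughly like the sketch below. It is a userland model of the flag selection only, not kernel code: MNT_RAW does not exist today, its value here is made up, and the other constants are given illustrative values rather than taken from <sys/namei.h> or <sys/mount.h>. The helper name unmount_namei_flags() is hypothetical as well.

```c
/*
 * Userland sketch of the namei flag choice proposed for sys_unmount().
 * All numeric values below are illustrative assumptions; the real
 * definitions live in the NetBSD kernel headers.
 */
#define FOLLOW      0x0040      /* resolve a trailing symlink */
#define LOCKLEAF    0x0004      /* return the leaf vnode locked */
#define TRYEMULROOT 0x0010      /* try the emulation root first */
#define MNT_FORCE   0x00080000  /* forced unmount (umount -f) */
#define MNT_RAW     0x40000000  /* HYPOTHETICAL: proposed flag, not in NetBSD */

/*
 * Hypothetical helper: compute the flags that would be handed to
 * NDINIT().  FOLLOW is dropped only when the caller passed MNT_RAW,
 * so the default behaviour of unmount(2) stays unchanged.
 */
static unsigned int
unmount_namei_flags(int mntflags)
{
	unsigned int flags = LOCKLEAF | TRYEMULROOT;

	if ((mntflags & MNT_RAW) == 0)
		flags |= FOLLOW;

	return flags;
}
```

With this, sys_unmount() would pass unmount_namei_flags(SCARG(uap, flags)) to NDINIT() in place of the fixed FOLLOW | LOCKLEAF | TRYEMULROOT. It only avoids the VOP_READLINK call on the symlink root; it does nothing about a namei() that hangs in VOP_LOOKUP().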
