On Mon, Oct 09, 2017 at 22:21 +0000, Alexander Bluhm wrote:
> Hi,
> 
> we sometimes see a panic "unmount: dangling vnode" when rebooting a 6.1
> system with softdep.
> 
> I have hacked some diagnostic panics until I got these traces from the
> reboot and update process.
> 
> Reboot:
> sleep_finish() at sleep_finish+0xb1
> tsleep() at tsleep+0x154
> biowait() at biowait+0x46
> bwrite() at bwrite+0x10d
> ffs_update() at ffs_update+0x2bd
> VOP_FSYNC() at VOP_FSYNC+0x3c
> ffs_flushfiles() at ffs_flushfiles+0xb9
> softdep_flushfiles() at softdep_flushfiles+0x4e
> ffs_unmount() at ffs_unmount+0x49
> dounmount_leaf() at dounmount_leaf+0x8b
> dounmount() at dounmount+0xb2
> vfs_unmountall() at vfs_unmountall+0x72
> vfs_shutdown() at vfs_shutdown+0x79
> boot() at boot+0x144
> reboot() at reboot+0x30
> sys_reboot() at sys_reboot+0x5e
> syscall() at syscall+0x21f
> 
> Update:
> *115878  74431      0     0x14000      0x200    0  update
> Debugger() at Debugger+0x9
> panic() at panic+0xfe
> insmntque() at insmntque+0x86
> getnewvnode() at getnewvnode+0x192
> ffs_vget() at ffs_vget+0x8b
> handle_workitem_remove() at handle_workitem_remove+0x4c
> process_worklist_item() at process_worklist_item+0xf5
> softdep_process_worklist() at softdep_process_worklist+0x169
> sched_sync() at sched_sync+0xfb
> 
> At reboot all vnodes are flushed, but when it sleeps, the update
> process has a chance to create new dirty vnodes.  Resolving soft
> dependencies adds vnodes to the dirty list.
> 
> In softdep_flushfiles() vnodes and softdep are flushed in a loop.
> But if they sleep, it is not guaranteed that all vnodes have been
> flushed when the softdep worklist flush reports that nothing has
> been done.
> 
> My solution is to do a final vnode flush after the softdep worklist
> has been flushed.  Then the dirty list is empty and the final check in
> dounmount_leaf() does not panic.
> 
> ok?
> 
> bluhm
>

Makes sense to me.  FreeBSD does something similar:
https://svnweb.freebsd.org/base/head/sys/ufs/ffs/ffs_softdep.c?revision=324039&view=markup#l1920

> Index: ufs/ffs/ffs_softdep.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/ufs/ffs/ffs_softdep.c,v
> retrieving revision 1.135
> diff -u -p -r1.135 ffs_softdep.c
> --- ufs/ffs/ffs_softdep.c     7 Nov 2016 00:26:33 -0000       1.135
> +++ ufs/ffs/ffs_softdep.c     9 Oct 2017 22:19:39 -0000
> @@ -904,6 +904,14 @@ softdep_flushfiles(struct mount *oldmnt,
>                       break;
>       }
>       /*
> +      * If the reboot process sleeps during the loop, the update
> +      * process may call softdep_process_worklist() and create
> +      * new dirty vnodes at the mount point.  Call ffs_flushfiles()
> +      * again after the loop has flushed all soft dependencies.
> +      */
> +     if (error == 0)
> +             error = ffs_flushfiles(oldmnt, flags, p);
> +     /*
>        * If we are unmounting then it is an error to fail. If we
>        * are simply trying to downgrade to read-only, then filesystem
>        * activity can keep us busy forever, so we just fail with EBUSY.
> 
