On Mon, Oct 09, 2017 at 22:21 +0000, Alexander Bluhm wrote:
> Hi,
>
> we sometimes see a panic "unmount: dangling vnode" when rebooting a 6.1
> system with softdep.
>
> I have hacked some diagnostic panics until I got these traces from the
> reboot and update process.
>
> Reboot:
> sleep_finish() at sleep_finish+0xb1
> tsleep() at tsleep+0x154
> biowait() at biowait+0x46
> bwrite() at bwrite+0x10d
> ffs_update() at ffs_update+0x2bd
> VOP_FSYNC() at VOP_FSYNC+0x3c
> ffs_flushfiles() at ffs_flushfiles+0xb9
> softdep_flushfiles() at softdep_flushfiles+0x4e
> ffs_unmount() at ffs_unmount+0x49
> dounmount_leaf() at dounmount_leaf+0x8b
> dounmount() at dounmount+0xb2
> vfs_unmountall() at vfs_unmountall+0x72
> vfs_shutdown() at vfs_shutdown+0x79
> boot() at boot+0x144
> reboot() at reboot+0x30
> sys_reboot() at sys_reboot+0x5e
> syscall() at syscall+0x21f
>
> Update:
> *115878 74431 0 0x14000 0x200 0 update
> Debugger() at Debugger+0x9
> panic() at panic+0xfe
> insmntque() at insmntque+0x86
> getnewvnode() at getnewvnode+0x192
> ffs_vget() at ffs_vget+0x8b
> handle_workitem_remove() at handle_workitem_remove+0x4c
> process_worklist_item() at process_worklist_item+0xf5
> softdep_process_worklist() at softdep_process_worklist+0x169
> sched_sync() at sched_sync+0xfb
>
> At reboot all vnodes are flushed, but when it sleeps, the update
> process has a chance to create new dirty vnodes. Resolving soft
> dependencies adds vnodes to the dirty list.
>
> In softdep_flushfiles() vnodes and softdep are flushed in a loop.
> But if they sleep, it is not guaranteed that all vnodes have been
> flushed when the softdep worklist flush reports that nothing has
> been done.
>
> My solution is to do a final vnode flush after the softdep worklist
> has been flushed. Then the dirty list is empty and the final check in
> dounmount_leaf() does not panic.
>
> ok?
>
> bluhm
>

Makes sense to me. FreeBSD does something similar:
https://svnweb.freebsd.org/base/head/sys/ufs/ffs/ffs_softdep.c?revision=324039&view=markup#l1920

> Index: ufs/ffs/ffs_softdep.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/ufs/ffs/ffs_softdep.c,v
> retrieving revision 1.135
> diff -u -p -r1.135 ffs_softdep.c
> --- ufs/ffs/ffs_softdep.c	7 Nov 2016 00:26:33 -0000	1.135
> +++ ufs/ffs/ffs_softdep.c	9 Oct 2017 22:19:39 -0000
> @@ -904,6 +904,14 @@ softdep_flushfiles(struct mount *oldmnt,
>  			break;
>  	}
>  	/*
> +	 * If the reboot process sleeps during the loop, the update
> +	 * process may call softdep_process_worklist() and create
> +	 * new dirty vnodes at the mount point.  Call ffs_flushfiles()
> +	 * again after the loop has flushed all soft dependencies.
> +	 */
> +	if (error == 0)
> +		error = ffs_flushfiles(oldmnt, flags, p);
> +	/*
>  	 * If we are unmounting then it is an error to fail. If we
>  	 * are simply trying to downgrade to read-only, then filesystem
>  	 * activity can keep us busy forever, so we just fail with EBUSY.
>
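
For anyone following along, here is a userland toy model of the race as I
understand it from the description above. It is only a sketch, not kernel
code: struct toy_mount and all toy_* functions are made up for illustration,
and I assume that every sleep inside the vnode flush lets the update process
drain the whole worklist, dirtying one vnode per work item.

```c
#include <assert.h>

/* Toy stand-in for a mount point; NOT the kernel's struct mount. */
struct toy_mount {
	int dirty_vnodes;	/* vnodes still on the dirty list */
	int worklist;		/* pending soft dependency work items */
};

/*
 * Models the update process calling softdep_process_worklist():
 * each resolved work item is assumed to dirty exactly one vnode.
 */
static void
toy_update_runs(struct toy_mount *m)
{
	assert(m->worklist >= 0);
	m->dirty_vnodes += m->worklist;
	m->worklist = 0;
}

/*
 * Models ffs_flushfiles(): the vnodes that are dirty on entry get
 * cleaned, but the write sleeps, so the update process runs before
 * we return.
 */
static void
toy_flushfiles(struct toy_mount *m)
{
	m->dirty_vnodes = 0;
	toy_update_runs(m);	/* we slept; update was scheduled */
}

/*
 * Models the worklist flush done by the reboot process itself.
 * Returns the number of work items it handled.
 */
static int
toy_flush_worklist(struct toy_mount *m)
{
	int count = m->worklist;

	m->dirty_vnodes += count;
	m->worklist = 0;
	return count;
}

/*
 * Simplified shape of the loop in softdep_flushfiles().  With
 * final_flush set, one more vnode flush is done after the loop,
 * as in the diff.  Returns the dirty vnodes left at the end.
 */
int
toy_softdep_flushfiles(int final_flush)
{
	struct toy_mount m = { .dirty_vnodes = 3, .worklist = 2 };
	int loopcnt;

	for (loopcnt = 10; loopcnt > 0; loopcnt--) {
		toy_flushfiles(&m);
		if (toy_flush_worklist(&m) == 0)
			break;	/* "nothing has been done" */
	}
	if (final_flush)
		toy_flushfiles(&m);
	return m.dirty_vnodes;
}
```

In this model toy_softdep_flushfiles(0) leaves vnodes on the dirty list,
because the update process emptied the worklist behind the loop's back and
the loop exits on count == 0. toy_softdep_flushfiles(1) ends with a clean
list, which is the state the final check in dounmount_leaf() expects.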