On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote: > Kostik Belousov wrote: > >On Sun, Nov 26, 2006 at 09:30:39AM +0100, V??clav Haisman wrote: > > > >>Hi, > >>the attached lor.txt contains LOR I got this yesterday. It is FreeBSD 6.1 > >>with relatively recent kernel, from last week or so. > >> > >>-- > >>VH > > > > > >>+lock order reversal: > >>+ 1st 0xc537f300 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547 > >>+ 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @ > >>/usr/src/sys/ufs/ufs/ufs_vnops.c:138 > >>+KDB: stack backtrace: > >>+kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at > >>kdb_backtrace+0x2f > >>+witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at > >>witness_checkorder+0x5fe > >>+_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at > >>_mtx_lock_flags+0x32 > >>+ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at > >>ufs_itimes+0x6c > >>+ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at > >>ufs_getattr+0x20 > >>+VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at > >>VOP_GETATTR_APV+0x3a > >>+filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75 > >>+knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75 > >>+VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at > >>VOP_WRITE_APV+0x148 > >>+vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at vn_write+0x201 > >>+dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,ffffffff,...) at > >>dofilewrite+0x84 > >>+kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65 > >>+write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f > >>+syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295 > >>+Xint0x80_syscall() at Xint0x80_syscall+0x1f > >>+--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp = > >>0xbfbfea1c, ebp = 0xbfbfea48 --- > > > > > >Thank you for the report. The LOR is caused by my commit into > >sys/ufs/ufs/ufs_vnops.c, rev. 1.280. > > Is the mount lock really required, if all we're doing is a single read of a > single word (mnt_kern_flags) (v_mount should be read-only for the whole > lifetime of the vnode, I believe)? After all, reads of a single word are > atomic on all our supported architectures. > The only situation I see where there MIGHT be problems are forced unmounts, > but I think there are bigger issues with those. > Sorry for noticing this email only now.
The problem is real with snapshotting. Ignoring
MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
inactivation time. This was the big trouble with nfsd and snapshots. As
such, I think that precise value of mmnt_kern_flag is critical there,
and mount interlock is needed.
Practically speaking, I agree with claim that reading of m_k_f is
surrounded by enough locked operations that would make sure that
the read value is not stale. But there is no such guarantee on
future/non-i386 arches, isn't it ?
As a side note, mount interlock scope could be reduced there.
Index: ufs/ufs/ufs_vnops.c
===================================================================
RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
retrieving revision 1.283
diff -u -r1.283 ufs_vnops.c
--- ufs/ufs/ufs_vnops.c 6 Nov 2006 13:42:09 -0000 1.283
+++ ufs/ufs/ufs_vnops.c 12 Dec 2006 10:18:04 -0000
@@ -133,19 +134,19 @@
{
struct inode *ip;
struct timespec ts;
- int mnt_locked;
ip = VTOI(vp);
- mnt_locked = 0;
if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
VI_LOCK(vp);
goto out;
}
MNT_ILOCK(vp->v_mount); /* For reading of mnt_kern_flags. */
- mnt_locked = 1;
VI_LOCK(vp);
- if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
- goto out_unl;
+ if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
+ MNT_IUNLOCK(vp->v_mount);
+ VI_UNLOCK(vp);
+ return;
+ }
if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
ip->i_flag |= IN_LAZYMOD;
@@ -155,6 +156,7 @@
ip->i_flag |= IN_MODIFIED;
else if (ip->i_flag & IN_ACCESS)
ip->i_flag |= IN_LAZYACCESS;
+ MNT_IUNLOCK(vp->v_mount);
vfs_timestamp(&ts);
if (ip->i_flag & IN_ACCESS) {
DIP_SET(ip, i_atime, ts.tv_sec);
@@ -172,10 +174,7 @@
out:
ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
- out_unl:
VI_UNLOCK(vp);
- if (mnt_locked)
- MNT_IUNLOCK(vp->v_mount);
}
/*
pgpkA3SWblcGR.pgp
Description: PGP signature
