On Sun, Sep 22, 2013 at 11:19:16PM +0300, Konstantin Belousov wrote:
> On Sun, Sep 22, 2013 at 01:14:21PM -0700, Matthew Fleming wrote:
> > On Sun, Sep 22, 2013 at 12:23 PM, Konstantin Belousov <[email protected]> wrote:
> > 
> > > Author: kib
> > > Date: Sun Sep 22 19:23:48 2013
> > > New Revision: 255797
> > > URL: http://svnweb.freebsd.org/changeset/base/255797
> > >
> > > Log:
> > >   Increase the chance of the buffer write from the bufdaemon helper
> > >   context to succeed.  If the locked vnode which owns the buffer to be
> > >   written is shared locked, try the non-blocking upgrade of the lock to
> > >   exclusive.
> > >
> > >   PR:   kern/178997
> > >   Reported and tested by:       Klaus Weber <
> > > [email protected]>
> > >   Sponsored by: The FreeBSD Foundation
> > >   MFC after:    1 week
> > >   Approved by:  re (marius)
> > >
> > > Modified:
> > >   head/sys/kern/vfs_bio.c
> > >
> > > Modified: head/sys/kern/vfs_bio.c
> > >
> > > ==============================================================================
> > > --- head/sys/kern/vfs_bio.c     Sun Sep 22 19:15:24 2013        (r255796)
> > > +++ head/sys/kern/vfs_bio.c     Sun Sep 22 19:23:48 2013        (r255797)
> > > @@ -2624,6 +2624,8 @@ flushbufqueues(struct vnode *lvp, int ta
> > >         int hasdeps;
> > >         int flushed;
> > >         int queue;
> > > +       int error;
> > > +       bool unlock;
> > >
> > >         flushed = 0;
> > >         queue = QUEUE_DIRTY;
> > > @@ -2699,7 +2701,16 @@ flushbufqueues(struct vnode *lvp, int ta
> > >                         BUF_UNLOCK(bp);
> > >                         continue;
> > >                 }
> > > -               if (vn_lock(vp, LK_EXCLUSIVE | LK_NOWAIT | LK_CANRECURSE) == 0) {
> > > +               if (lvp == NULL) {
> > > +                       unlock = true;
> > > +                       error = vn_lock(vp, LK_EXCLUSIVE | LK_NOWAIT);
> > > +               } else {
> > > +                       ASSERT_VOP_LOCKED(vp, "getbuf");
> > > +                       unlock = false;
> > > +                       error = VOP_ISLOCKED(vp) == LK_EXCLUSIVE ? 0 :
> > > +                           vn_lock(vp, LK_UPGRADE | LK_NOWAIT);
> > >
> > 
> > I don't think this is quite right.
> > 
> > When the lock is held shared, and VOP_LOCK is implemented by lockmgr(9),
> > (i.e. all in-tree filesystems?), LK_UPGRADE may drop the lock, and not
> > reacquire it.  This would happen when the vnode is locked shared, the
> > upgrade fails (2 shared owners), then lockmgr(9) will try to lock EX, which
> > will also fail (still one shared owner).  The caller's lock is no longer
> > held.
> > 
> > Doesn't that scenario (LK_UPGRADE failing) cause problems both for the
> > caller (unexpected unlock) and for flushbufqueues(), which expects the
> > vnode lock to be held since lvp is non-NULL?
> 
> Does it ? If the lock is dropped, the code is indeed in trouble.
> Please note that LK_NOWAIT is specified for upgrade, and I believe
> that this causes lockmgr to return with EBUSY without dropping
> the lock.

Yes, you are right; I have reverted the patch.  Thank you for noting this.

I have been bitten by the unreasonable behaviour of the non-blocking
upgrade once more.  It has a history.

Some time ago I proposed the following patch, which was turned down.
At that time, I was able to work around the case.  For the bufdaemon
helper, I do not see any way to avoid this, except by sometimes locking
the reader vnode exclusive in anticipation of the dirty buffer count
getting too high.

diff --git a/sys/kern/kern_lock.c b/sys/kern/kern_lock.c
index 74a5b19..2f2dbf6 100644
--- a/sys/kern/kern_lock.c
+++ b/sys/kern/kern_lock.c
@@ -694,6 +692,7 @@ __lockmgr_args(struct lock *lk, u_int flags, struct lock_object *ilk,
                }
                break;
        case LK_UPGRADE:
+       case LK_TRYUPGRADE:
                _lockmgr_assert(lk, KA_SLOCKED, file, line);
                v = lk->lk_lock;
                x = v & LK_ALL_WAITERS;
@@ -714,6 +713,17 @@ __lockmgr_args(struct lock *lk, u_int flags, struct lock_object *ilk,
                }
 
                /*
+                * In LK_TRYUPGRADE mode, do not drop the lock,
+                * returning EBUSY instead.
+                */
+               if (op == LK_TRYUPGRADE) {
+                       LOCK_LOG2(lk, "%s: %p failed the nowait upgrade",
+                           __func__, lk);
+                       error = EBUSY;
+                       break;
+               }
+
+               /*
                 * We have been unable to succeed in upgrading, so just
                 * give up the shared lock.
                 */
diff --git a/sys/sys/lockmgr.h b/sys/sys/lockmgr.h
index f525a06..7c51830 100644
--- a/sys/sys/lockmgr.h
+++ b/sys/sys/lockmgr.h
@@ -169,6 +168,7 @@ _lockmgr_args_rw(struct lock *lk, u_int flags, struct rwlock *ilk,
 #define        LK_RELEASE      0x100000
 #define        LK_SHARED       0x200000
 #define        LK_UPGRADE      0x400000
+#define        LK_TRYUPGRADE   0x800000
 
 #define        LK_TOTAL_MASK   (LK_INIT_MASK | LK_EATTR_MASK | LK_TYPE_MASK)
 
