On Tue, Oct 10, 2017 at 10:15:48AM +0200, Martin Pieuchot wrote:
> Hello Mateusz,
>
> On 09/10/17(Mon) 21:43, Mateusz Guzik wrote:
> > I was looking at rw lock code out of curiosity and noticed you always do
> > membar_enter which on MP-enabled amd64 kernel translates to mfence.
> > This makes the entire business a little bit slower.
> >
> > Interestingly you already have relevant macros for amd64:
> > #define membar_enter_after_atomic() __membar("")
> > #define membar_exit_before_atomic() __membar("")
>
> If you look at the history, you'll see that these macros didn't exist
> when membar_* were added to rw locks.
>

I know. The addition of such a macro sounds like an excellent opportunity
to examine existing users. I figured rw locks were omitted by accident,
as the original posting explicitly mentions mutexes.

https://marc.info/?l=openbsd-tech&m=149555411827749&w=2

> > And there is even a default variant for archs which don't provide one.
> > I guess the switch should be easy.
>
> Then please do it :)  A lot of stuff is easy on OpenBSD but there are
> not enough of us.
>
> Do it, test it, explain it, get oks and commit it 8)
>

As noted earlier, the patch is trivial. However, I have no easy way to
even compile-test it, as I'm not running OpenBSD; I only had a look out
of curiosity.

Nonetheless, here is the diff, which can be split in two: one part deals
with the barriers, the other cleans up rw_status.

This should not result in any binary change on architectures other than
i386 and amd64. If there is breakage on those two, it's a braino of some
sort; the change is definitely fine in principle.

So:

Utilize membar_*{before,after}_atomic in rw locks.

On architectures whose atomic operations provide the relevant barriers,
there is no need for an additional membar. In particular, this is the
case on amd64 and i386.

Read from the lock only once in rw_status. The lock word is marked
volatile, which forces a re-read on each access. There is no correctness
issue here, but the generated assembly is worse than it needs to be, and
under contention this adds avoidable cache traffic.

diff --git a/sys/kern/kern_rwlock.c b/sys/kern/kern_rwlock.c
index e2c3cae..5551835 100644
--- a/sys/kern/kern_rwlock.c
+++ b/sys/kern/kern_rwlock.c
@@ -96,7 +96,7 @@ _rw_enter_read(struct rwlock *rwl LOCK_FL_VARS)
            rw_cas(&rwl->rwl_owner, owner, owner + RWLOCK_READ_INCR)))
                _rw_enter(rwl, RW_READ LOCK_FL_ARGS);
        else {
-               membar_enter();
+               membar_enter_after_atomic();
                WITNESS_CHECKORDER(&rwl->rwl_lock_obj, LOP_NEWORDER, file,
line,
                    NULL);
                WITNESS_LOCK(&rwl->rwl_lock_obj, 0, file, line);
@@ -112,7 +112,7 @@ _rw_enter_write(struct rwlock *rwl LOCK_FL_VARS)
            RW_PROC(p) | RWLOCK_WRLOCK)))
                _rw_enter(rwl, RW_WRITE LOCK_FL_ARGS);
        else {
-               membar_enter();
+               membar_enter_after_atomic();
                WITNESS_CHECKORDER(&rwl->rwl_lock_obj,
                    LOP_EXCLUSIVE | LOP_NEWORDER, file, line, NULL);
                WITNESS_LOCK(&rwl->rwl_lock_obj, LOP_EXCLUSIVE, file, line);
@@ -126,7 +126,7 @@ _rw_exit_read(struct rwlock *rwl LOCK_FL_VARS)

        rw_assert_rdlock(rwl);

-       membar_exit();
+       membar_exit_before_atomic();
        if (__predict_false((owner & RWLOCK_WAIT) ||
            rw_cas(&rwl->rwl_owner, owner, owner - RWLOCK_READ_INCR)))
                _rw_exit(rwl LOCK_FL_ARGS);
@@ -141,7 +141,7 @@ _rw_exit_write(struct rwlock *rwl LOCK_FL_VARS)

        rw_assert_wrlock(rwl);

-       membar_exit();
+       membar_exit_before_atomic();
        if (__predict_false((owner & RWLOCK_WAIT) ||
            rw_cas(&rwl->rwl_owner, owner, 0)))
                _rw_exit(rwl LOCK_FL_ARGS);
@@ -261,7 +261,7 @@ retry:

        if (__predict_false(rw_cas(&rwl->rwl_owner, o, o + inc)))
                goto retry;
-       membar_enter();
+       membar_enter_after_atomic();

        /*
         * If old lock had RWLOCK_WAIT and RWLOCK_WRLOCK set, it means we
@@ -295,7 +295,7 @@ _rw_exit(struct rwlock *rwl LOCK_FL_VARS)
        WITNESS_UNLOCK(&rwl->rwl_lock_obj, wrlock ? LOP_EXCLUSIVE : 0,
            file, line);

-       membar_exit();
+       membar_exit_before_atomic();
        do {
                owner = rwl->rwl_owner;
                if (wrlock)
@@ -312,13 +312,15 @@ _rw_exit(struct rwlock *rwl LOCK_FL_VARS)
 int
 rw_status(struct rwlock *rwl)
 {
-       if (rwl->rwl_owner & RWLOCK_WRLOCK) {
-               if (RW_PROC(curproc) == RW_PROC(rwl->rwl_owner))
+       unsigned long owner = rwl->rwl_owner;
+
+       if (owner & RWLOCK_WRLOCK) {
+               if (RW_PROC(curproc) == RW_PROC(owner))
                        return RW_WRITE;
                else
                        return RW_WRITE_OTHER;
        }
-       if (rwl->rwl_owner)
+       if (owner)
                return RW_READ;
        return (0);
 }

-- 
Mateusz Guzik <mjguzik gmail.com>
