Re: [PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
On 01/20/2014 10:18 AM, Peter Zijlstra wrote:
> On Tue, Jan 14, 2014 at 11:44:06PM -0500, Waiman Long wrote:
> > This patch modifies the queue_write_unlock() function to use the
> > new smp_store_release() function (currently in tip). It also removes
> > the temporary implementation of smp_load_acquire() and
> > smp_store_release() function in qrwlock.c.
> >
> > This patch will use atomic subtraction instead if the writer field is
> > not atomic.
> >
> > Signed-off-by: Waiman Long <waiman.l...@hp.com>
> > ---
> >  include/asm-generic/qrwlock.h | 10 ++
> >  kernel/locking/qrwlock.c      | 34 --
> >  2 files changed, 6 insertions(+), 38 deletions(-)
> >
> > diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
> > index 5abb6ca..68f488b 100644
> > --- a/include/asm-generic/qrwlock.h
> > +++ b/include/asm-generic/qrwlock.h
> > @@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
> >  static inline void queue_write_unlock(struct qrwlock *lock)
> >  {
> >  	/*
> > -	 * Make sure that none of the critical section will be leaked out.
> > +	 * If the writer field is atomic, it can be cleared directly.
> > +	 * Otherwise, an atomic subtraction will be used to clear it.
> >  	 */
> > -	smp_mb__before_clear_bit();
> > -	ACCESS_ONCE(lock->cnts.writer) = 0;
> > -	smp_mb__after_clear_bit();
> > +	if (__native_word(lock->cnts.writer))
> > +		smp_store_release(&lock->cnts.writer, 0);
> > +	else
> > +		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
> >  }
>
> If we're a writer, read-count must be zero. The only way for that not to
> be zero is a concurrent read-(try)lock.
>
> If you move all the read-(try)locks over to cmpxchg() you can avoid this
> afaict:

That is not true. A reader may transiently set the reader count to a
non-zero value in the fast path. Also, a reader in interrupt context
will force a non-zero reader count to take the read lock as soon as the
writer is done.

> static inline int queue_read_trylock(struct qrwlock *lock)
> {
> 	union qrwcnts cnts;
>
> 	cnts = ACCESS_ONCE(lock->cnts);
> 	if (!cnts.writer) {
> 		if (cmpxchg(&lock->cnts.rwc, cnts.rwc, cnts.rwc + _QR_BIAS) == cnts.rwc)
> 			return 1;
> 	}
>
> 	return 0;
> }
>
> static inline void queue_read_lock(struct qrwlock *lock)
> {
> 	if (!queue_read_trylock(lock))
> 		queue_read_lock_slowpath(); // XXX do not assume extra _QR_BIAS
> }
>
> At which point you have the guarantee that read-count == 0, and you can
> write:
>
> static inline void queue_write_unlock(struct qrwlock *lock)
> {
> 	smp_store_release(&lock->cnts.rwc, 0);
> }
>
> No?

The current code is optimized for the reader-heavy case, so I used xadd
to increment the reader count; this reduces the chance of a retry caused
by concurrent reader count updates. The downside is the need to back out
if a writer is present. I can change the logic to use only cmpxchg for
readers, but I don't see a compelling reason to do so.

-Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
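The reader-side trade-off discussed in this message can be sketched in user space with C11 atomics. This is only an illustration, not code from the patch series: the lock-word layout (writer state in the low byte, reader count above it) and the names toy_rwlock, QW_MASK, QW_LOCKED and QR_BIAS are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low byte = writer state, upper bits = reader count. */
#define QW_MASK   0xffu
#define QW_LOCKED 0xffu
#define QR_BIAS   0x100u

struct toy_rwlock {
	_Atomic uint32_t cnts;
};

/*
 * xadd style: bump the reader count unconditionally, then back out if a
 * writer turned out to be present. It never retries because of other
 * readers, but the count is transiently non-zero while a writer holds
 * the lock.
 */
static bool read_trylock_xadd(struct toy_rwlock *l)
{
	uint32_t old = atomic_fetch_add_explicit(&l->cnts, QR_BIAS,
						 memory_order_acquire);

	if (!(old & QW_MASK))
		return true;

	/* Writer present: undo the transient increment. */
	atomic_fetch_sub_explicit(&l->cnts, QR_BIAS, memory_order_relaxed);
	return false;
}

/*
 * cmpxchg style: only modify the word when no writer was observed. The
 * count is never touched under a writer, but the exchange can fail when
 * another reader updates the count concurrently.
 */
static bool read_trylock_cmpxchg(struct toy_rwlock *l)
{
	uint32_t old = atomic_load_explicit(&l->cnts, memory_order_relaxed);

	if (old & QW_MASK)
		return false;

	return atomic_compare_exchange_strong_explicit(&l->cnts, &old,
						       old + QR_BIAS,
						       memory_order_acquire,
						       memory_order_relaxed);
}
```

Both functions return false when a writer holds the lock; they differ only in whether the reader count is disturbed on the way there.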
Re: [PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
On Tue, Jan 21, 2014 at 10:45:42AM -0500, Waiman Long wrote:
> I can change the logic to use only cmpxchg for readers, but I don't see a
> compelling reason to do so.

The fact that you can then use smp_store_release() is fairly compelling
methinks.
Re: [PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
On Tue, Jan 14, 2014 at 11:44:06PM -0500, Waiman Long wrote:
> This patch modifies the queue_write_unlock() function to use the new
> smp_store_release() function (currently in tip). It also removes the
> temporary implementation of smp_load_acquire() and smp_store_release()
> function in qrwlock.c.
>
> This patch will use atomic subtraction instead if the writer field is
> not atomic.
>
> Signed-off-by: Waiman Long <waiman.l...@hp.com>
> ---
>  include/asm-generic/qrwlock.h | 10 ++
>  kernel/locking/qrwlock.c      | 34 --
>  2 files changed, 6 insertions(+), 38 deletions(-)
>
> diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
> index 5abb6ca..68f488b 100644
> --- a/include/asm-generic/qrwlock.h
> +++ b/include/asm-generic/qrwlock.h
> @@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
>  static inline void queue_write_unlock(struct qrwlock *lock)
>  {
>  	/*
> -	 * Make sure that none of the critical section will be leaked out.
> +	 * If the writer field is atomic, it can be cleared directly.
> +	 * Otherwise, an atomic subtraction will be used to clear it.
>  	 */
> -	smp_mb__before_clear_bit();
> -	ACCESS_ONCE(lock->cnts.writer) = 0;
> -	smp_mb__after_clear_bit();
> +	if (__native_word(lock->cnts.writer))
> +		smp_store_release(&lock->cnts.writer, 0);
> +	else
> +		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
>  }

If we're a writer, read-count must be zero. The only way for that not to
be zero is a concurrent read-(try)lock.

If you move all the read-(try)locks over to cmpxchg() you can avoid this
afaict:

static inline int queue_read_trylock(struct qrwlock *lock)
{
	union qrwcnts cnts;

	cnts = ACCESS_ONCE(lock->cnts);
	if (!cnts.writer) {
		if (cmpxchg(&lock->cnts.rwc, cnts.rwc, cnts.rwc + _QR_BIAS) == cnts.rwc)
			return 1;
	}

	return 0;
}

static inline void queue_read_lock(struct qrwlock *lock)
{
	if (!queue_read_trylock(lock))
		queue_read_lock_slowpath(); // XXX do not assume extra _QR_BIAS
}

At which point you have the guarantee that read-count == 0, and you can
write:

static inline void queue_write_unlock(struct qrwlock *lock)
{
	smp_store_release(&lock->cnts.rwc, 0);
}

No?
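The proposal above can be tried out in user space with C11 atomics. The following is a minimal sketch under stated assumptions, not the kernel implementation: the single 32-bit lock word, the names toy_rwlock, QW_MASK, QW_LOCKED and QR_BIAS, and the toy_* helpers are all hypothetical, and a C11 release store stands in for smp_store_release().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: low byte = writer state, upper bits = reader count. */
#define QW_MASK   0xffu
#define QW_LOCKED 0xffu
#define QR_BIAS   0x100u

struct toy_rwlock {
	_Atomic uint32_t cnts;
};

/* Readers use compare-and-swap only, so they never touch a write-locked word. */
static bool toy_read_trylock(struct toy_rwlock *l)
{
	uint32_t old = atomic_load_explicit(&l->cnts, memory_order_relaxed);

	if (old & QW_MASK)
		return false;

	return atomic_compare_exchange_strong_explicit(&l->cnts, &old,
						       old + QR_BIAS,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* A writer can only get in when there are no readers and no writer. */
static bool toy_write_trylock(struct toy_rwlock *l)
{
	uint32_t expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->cnts, &expected,
						       QW_LOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/*
 * While the writer holds the lock the whole word is exactly QW_LOCKED,
 * because no reader modifies it. Clearing the word with a single release
 * store therefore cannot lose a reader's update.
 */
static void toy_write_unlock(struct toy_rwlock *l)
{
	atomic_store_explicit(&l->cnts, 0, memory_order_release);
}
```

The whole-word store is only safe because of the reader-side rule; with an unconditional increment on the read path the same store could erase a reader's contribution.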
Re: [PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
On Tue, Jan 14, 2014 at 11:44:06PM -0500, Waiman Long wrote:
> This patch modifies the queue_write_unlock() function to use the new
> smp_store_release() function (currently in tip). It also removes the
> temporary implementation of smp_load_acquire() and smp_store_release()
> function in qrwlock.c.
>
> This patch will use atomic subtraction instead if the writer field is
> not atomic.
>
> Signed-off-by: Waiman Long <waiman.l...@hp.com>

Reviewed-by: Paul E. McKenney <paul...@linux.vnet.ibm.com>

> ---
>  include/asm-generic/qrwlock.h | 10 ++
>  kernel/locking/qrwlock.c      | 34 --
>  2 files changed, 6 insertions(+), 38 deletions(-)
>
> diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
> index 5abb6ca..68f488b 100644
> --- a/include/asm-generic/qrwlock.h
> +++ b/include/asm-generic/qrwlock.h
> @@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
>  static inline void queue_write_unlock(struct qrwlock *lock)
>  {
>  	/*
> -	 * Make sure that none of the critical section will be leaked out.
> +	 * If the writer field is atomic, it can be cleared directly.
> +	 * Otherwise, an atomic subtraction will be used to clear it.
>  	 */
> -	smp_mb__before_clear_bit();
> -	ACCESS_ONCE(lock->cnts.writer) = 0;
> -	smp_mb__after_clear_bit();
> +	if (__native_word(lock->cnts.writer))
> +		smp_store_release(&lock->cnts.writer, 0);
> +	else
> +		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
>  }
>
>  /*
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 053be4d..2727188 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -47,40 +47,6 @@
>  # define arch_mutex_cpu_relax() cpu_relax()
>  #endif
>
> -#ifndef smp_load_acquire
> -# ifdef CONFIG_X86
> -# define smp_load_acquire(p)				\
> -	({						\
> -		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
> -		barrier();				\
> -		___p1;					\
> -	})
> -# else
> -# define smp_load_acquire(p)				\
> -	({						\
> -		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
> -		smp_mb();				\
> -		___p1;					\
> -	})
> -# endif
> -#endif
> -
> -#ifndef smp_store_release
> -# ifdef CONFIG_X86
> -# define smp_store_release(p, v)		\
> -	do {					\
> -		barrier();			\
> -		ACCESS_ONCE(*p) = v;		\
> -	} while (0)
> -# else
> -# define smp_store_release(p, v)		\
> -	do {					\
> -		smp_mb();			\
> -		ACCESS_ONCE(*p) = v;		\
> -	} while (0)
> -# endif
> -#endif
> -
>  /*
>   * If an xadd (exchange-add) macro isn't available, simulate one with
>   * the atomic_add_return() function.
> --
> 1.7.1
>
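For readers unfamiliar with the acquire/release pair that the removed macros emulated, the ordering contract can be sketched with C11 atomics in user space. This is an analogy, not the kernel's definition: the mailbox structure and the publish()/try_consume() helpers are hypothetical names, and memory_order_release / memory_order_acquire stand in for smp_store_release() / smp_load_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical example type: a payload guarded by a "ready" flag. */
struct mailbox {
	int payload;		/* plain data, published via the flag */
	atomic_int ready;	/* 0 = empty, 1 = payload is valid */
};

/*
 * Release store: every write made before it (here, the payload) is
 * visible to any thread that later observes ready == 1 with an acquire
 * load. A write_unlock needs the same property for its critical section.
 */
static void publish(struct mailbox *m, int value)
{
	m->payload = value;
	atomic_store_explicit(&m->ready, 1, memory_order_release);
}

/*
 * Acquire load: reads that follow it cannot be reordered before it, so
 * the payload is read only after the flag has been seen set.
 */
static bool try_consume(struct mailbox *m, int *out)
{
	if (!atomic_load_explicit(&m->ready, memory_order_acquire))
		return false;

	*out = m->payload;
	return true;
}
```

In a real program publish() and try_consume() would run on different threads; the sketch leaves thread creation out so that it stays self-contained.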
[PATCH v9 4/5] qrwlock: Use smp_store_release() in write_unlock()
This patch modifies the queue_write_unlock() function to use the new
smp_store_release() function (currently in tip). It also removes the
temporary implementation of smp_load_acquire() and smp_store_release()
function in qrwlock.c.

This patch will use atomic subtraction instead if the writer field is
not atomic.

Signed-off-by: Waiman Long <waiman.l...@hp.com>
---
 include/asm-generic/qrwlock.h | 10 ++
 kernel/locking/qrwlock.c      | 34 --
 2 files changed, 6 insertions(+), 38 deletions(-)

diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
index 5abb6ca..68f488b 100644
--- a/include/asm-generic/qrwlock.h
+++ b/include/asm-generic/qrwlock.h
@@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
 static inline void queue_write_unlock(struct qrwlock *lock)
 {
 	/*
-	 * Make sure that none of the critical section will be leaked out.
+	 * If the writer field is atomic, it can be cleared directly.
+	 * Otherwise, an atomic subtraction will be used to clear it.
 	 */
-	smp_mb__before_clear_bit();
-	ACCESS_ONCE(lock->cnts.writer) = 0;
-	smp_mb__after_clear_bit();
+	if (__native_word(lock->cnts.writer))
+		smp_store_release(&lock->cnts.writer, 0);
+	else
+		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
 }

 /*
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index 053be4d..2727188 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -47,40 +47,6 @@
 # define arch_mutex_cpu_relax() cpu_relax()
 #endif

-#ifndef smp_load_acquire
-# ifdef CONFIG_X86
-# define smp_load_acquire(p)				\
-	({						\
-		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
-		barrier();				\
-		___p1;					\
-	})
-# else
-# define smp_load_acquire(p)				\
-	({						\
-		typeof(*p) ___p1 = ACCESS_ONCE(*p);	\
-		smp_mb();				\
-		___p1;					\
-	})
-# endif
-#endif
-
-#ifndef smp_store_release
-# ifdef CONFIG_X86
-# define smp_store_release(p, v)		\
-	do {					\
-		barrier();			\
-		ACCESS_ONCE(*p) = v;		\
-	} while (0)
-# else
-# define smp_store_release(p, v)		\
-	do {					\
-		smp_mb();			\
-		ACCESS_ONCE(*p) = v;		\
-	} while (0)
-# endif
-#endif
-
 /*
  * If an xadd (exchange-add) macro isn't available, simulate one with
  * the atomic_add_return() function.
--
1.7.1
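Why the patch clears only the writer field, rather than storing zero to the whole lock word, can be illustrated with a small user-space sketch in C11 atomics. The lock-word layout and the names toy_rwlock, QW_LOCKED, QR_BIAS and the two unlock helpers are assumptions made for the example; atomic_fetch_sub_explicit() stands in for the atomic_sub() fallback path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical layout: low byte = writer state, upper bits = reader count. */
#define QW_LOCKED 0xffu
#define QR_BIAS   0x100u

struct toy_rwlock {
	_Atomic uint32_t cnts;
};

/*
 * Analogue of the atomic subtraction path: only the writer bits are
 * removed, so a reader count that was bumped while the writer held the
 * lock survives the unlock.
 */
static void write_unlock_sub(struct toy_rwlock *l)
{
	atomic_fetch_sub_explicit(&l->cnts, QW_LOCKED, memory_order_release);
}

/*
 * Whole-word release store: cheaper, but it overwrites any reader bias
 * present at unlock time. It is only correct if readers never modify
 * the word while a writer holds the lock.
 */
static void write_unlock_store_word(struct toy_rwlock *l)
{
	atomic_store_explicit(&l->cnts, 0, memory_order_release);
}
```

Starting from a word that holds QW_LOCKED plus one transient QR_BIAS, the subtraction leaves QR_BIAS in place, while the whole-word store leaves zero and the reader's increment is lost.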