Re: [PATCH] locking/qspinlock: Use atomic_sub_return_release in queued_spin_unlock

2016-06-14 Thread Peter Zijlstra
On Tue, Jun 14, 2016 at 01:52:53PM +0800, Boqun Feng wrote:
> On Mon, Jun 13, 2016 at 12:45:23PM -0700, Davidlohr Bueso wrote:
> > On Fri, 03 Jun 2016, Pan Xinhui wrote:
> > 
> > > The existing version uses a heavy barrier while only release semantics
> > > is required. So use atomic_sub_return_release instead.
> > > 
> > > Suggested-by: Peter Zijlstra (Intel) 
> > > Signed-off-by: Pan Xinhui 
> > 
> > I just noticed this change in -tip and, while I know that saving a barrier
> > in core spinlock paths is perhaps a worthy exception, I cannot help but
> > wonder if this is the beginning of the end for smp_mb__{before,after}_atomic().
> 
> This is surely a good direction, I think: using _acquire and _release
> primitives to replace those barriers. However, I think we should do
> this carefully, because the _acquire and _release primitives are RCpc
> on PPC; IOW, an ACQUIRE and RELEASE pair is neither a full barrier nor
> does it provide global transitivity. I'm worried that there are some
> users depending on the full-barrier semantics, which means we must
> audit each use carefully before we make the change.

Very good point indeed. And yes, the whole RCpc thing, but also the
tricky wandering store on PPC/ARM64 ACQUIRE makes for lots of 'fun' we
can do without.
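
Roughly, the wandering store, as a made-up litmus sketch (my reading of
the lwarx/stwcx. and ldaxr/stxr style ACQUIRE; the names data and lock
are invented):

/*
 *	CPU0					CPU1
 *	----					----
 *	xchg_acquire(&lock, 1);			r1 = READ_ONCE(data);
 *	WRITE_ONCE(data, 1);			smp_rmb();
 *						r2 = READ_ONCE(lock);
 *
 * The ACQUIRE orders CPU0's *load* of lock before the store to data,
 * but nothing orders the *store* half of the xchg against the data
 * store, so CPU1 can observe r1 == 1 && r2 == 0: the lock-word store
 * has wandered past the critical-section store.
 */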

> Besides, if we want to do the conversion, we'd better have _acquire and
> _release variants for non-value-returning atomic operations.

Indeed, I've been tempted to introduce those before.

> I remember you were working on those variants. How is that going?

Ah, if Davidlohr is working on that, brilliant, less work for me ;-)


Re: [PATCH] locking/qspinlock: Use atomic_sub_return_release in queued_spin_unlock

2016-06-13 Thread Boqun Feng
On Mon, Jun 13, 2016 at 12:45:23PM -0700, Davidlohr Bueso wrote:
> On Fri, 03 Jun 2016, Pan Xinhui wrote:
> 
> > The existing version uses a heavy barrier while only release semantics
> > is required. So use atomic_sub_return_release instead.
> > 
> > Suggested-by: Peter Zijlstra (Intel) 
> > Signed-off-by: Pan Xinhui 
> 
> I just noticed this change in -tip and, while I know that saving a barrier
> in core spinlock paths is perhaps a worthy exception, I cannot help but
> wonder if this is the beginning of the end for smp_mb__{before,after}_atomic().

This is surely a good direction, I think: using _acquire and _release
primitives to replace those barriers. However, I think we should do
this carefully, because the _acquire and _release primitives are RCpc
on PPC; IOW, an ACQUIRE and RELEASE pair is neither a full barrier nor
does it provide global transitivity. I'm worried that there are some
users depending on the full-barrier semantics, which means we must
audit each use carefully before we make the change.
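
To make the RCpc point concrete, below is a store-buffering harness in
userspace C11 -- entirely illustrative: the file name, variable names,
and the use of a release store plus an acquire RMW standing in for
UNLOCK(A); LOCK(B) are all invented for this sketch. On POWER or ARM it
may occasionally print the r0 == 0 && r1 == 0 outcome that a full
barrier on P0 would forbid; x86 is unlikely to ever show it:

/* sb_rcpc.c -- illustrative only; all names are made up.
 *
 * P0 does:  x = 1; RELEASE(la); ACQUIRE(lb); r0 = y;
 * P1 does:  y = 1; full barrier; r1 = x;
 *
 * With full-barrier (RCsc-like) semantics on P0 the outcome
 * r0 == 0 && r1 == 0 would be forbidden; with a plain
 * release/acquire (RCpc) pair it is allowed.
 *
 * Build: cc -O2 -pthread sb_rcpc.c -o sb_rcpc
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static atomic_int x, y, la, lb, go;
static int r0, r1;

static void *p0(void *arg)
{
	while (!atomic_load_explicit(&go, memory_order_acquire))
		;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_store_explicit(&la, 0, memory_order_release);	 /* UNLOCK A */
	atomic_exchange_explicit(&lb, 1, memory_order_acquire); /* LOCK B   */
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

static void *p1(void *arg)
{
	while (!atomic_load_explicit(&go, memory_order_acquire))
		;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);		/* full barrier */
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

int main(void)
{
	for (int i = 0; i < 200000; i++) {
		pthread_t a, b;

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		atomic_store(&la, 1);	/* lock A initially held by P0 */
		atomic_store(&lb, 0);	/* lock B initially free       */
		atomic_store(&go, 0);

		pthread_create(&a, NULL, p0, NULL);
		pthread_create(&b, NULL, p1, NULL);
		atomic_store_explicit(&go, 1, memory_order_release);
		pthread_join(a, NULL);
		pthread_join(b, NULL);

		if (r0 == 0 && r1 == 0)
			printf("hit at iteration %d: r0 == r1 == 0\n", i);
	}
	return 0;
}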

Besides, if we want to do the conversion, we'd better have _acquire and
_release variants for non-value-returning atomic operations.
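
Purely as a sketch of the shape such a fallback could take (there is no
atomic_sub_release() in mainline; the name and the generic definition
here are hypothetical):

/* Hypothetical generic fallback -- not mainline code.  A
 * non-value-returning RELEASE op can always be synthesized by
 * discarding the result of the value-returning variant; an
 * architecture could then override it with something cheaper.
 */
#ifndef atomic_sub_release
#define atomic_sub_release(i, v) \
	((void)atomic_sub_return_release((i), (v)))
#endif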

I remember you were working on those variants. How is that going?

Regards,
Boqun


Re: [PATCH] locking/qspinlock: Use atomic_sub_return_release in queued_spin_unlock

2016-06-13 Thread Davidlohr Bueso

On Fri, 03 Jun 2016, Pan Xinhui wrote:

> The existing version uses a heavy barrier while only release semantics
> is required. So use atomic_sub_return_release instead.
> 
> Suggested-by: Peter Zijlstra (Intel) 
> Signed-off-by: Pan Xinhui 


I just noticed this change in -tip and, while I know that saving a barrier
in core spinlock paths is perhaps a worthy exception, I cannot help but
wonder if this is the beginning of the end for smp_mb__{before,after}_atomic().


[PATCH] locking/qspinlock: Use atomic_sub_return_release in queued_spin_unlock

2016-06-03 Thread Pan Xinhui
The existing version uses a heavy barrier while only release semantics
is required. So use atomic_sub_return_release instead.

Suggested-by: Peter Zijlstra (Intel) 
Signed-off-by: Pan Xinhui 
---
 include/asm-generic/qspinlock.h | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 35a52a8..8947cd2 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -92,10 +92,9 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock)
 static __always_inline void queued_spin_unlock(struct qspinlock *lock)
 {
 	/*
-	 * smp_mb__before_atomic() in order to guarantee release semantics
-	 */
-	smp_mb__before_atomic();
-	atomic_sub(_Q_LOCKED_VAL, &lock->val);
+	 * unlock() needs release semantics
+	 */
+	(void)atomic_sub_return_release(_Q_LOCKED_VAL, &lock->val);
 }
 #endif
 
-- 
1.9.1
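
A userspace C11 model of the patched unlock, showing why release
ordering suffices (the struct and names below are simplified stand-ins
for illustration, not the kernel's own types):

#include <stdatomic.h>

#define _Q_LOCKED_VAL	1

struct qspinlock_model {
	atomic_int val;		/* low bits hold the locked flag */
};

/*
 * Everything the holder wrote inside the critical section must be
 * visible to the next acquirer: that is exactly a RELEASE store.
 * Nothing that follows the unlock needs to be ordered by it, so the
 * old full barrier (smp_mb__before_atomic()) was stronger than
 * required.
 */
static inline void model_queued_spin_unlock(struct qspinlock_model *lock)
{
	(void)atomic_fetch_sub_explicit(&lock->val, _Q_LOCKED_VAL,
					memory_order_release);
}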