[patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Nick Piggin
lwsync is the recommended method of store/store ordering on caching enabled memory. For those subarchs which have lwsync, use it rather than eieio for smp_wmb.

Signed-off-by: Nick Piggin [EMAIL PROTECTED]
---
Index: linux-2.6/include/asm-powerpc/system.h

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Benjamin Herrenschmidt
On Wed, 2008-05-21 at 16:12 +0200, Nick Piggin wrote:
> lwsync is the recommended method of store/store ordering on caching enabled memory. For those subarchs which have lwsync, use it rather than eieio for smp_wmb.

Yuck... existence of lwsync depends on the processor at boot time...

Ben.

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Nick Piggin
On Wed, May 21, 2008 at 11:26:32AM -0400, Benjamin Herrenschmidt wrote:
> On Wed, 2008-05-21 at 16:12 +0200, Nick Piggin wrote:
> > lwsync is the recommended method of store/store ordering on caching enabled memory. For those subarchs which have lwsync, use it rather than eieio for smp_wmb.

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Benjamin Herrenschmidt
On Wed, 2008-05-21 at 17:34 +0200, Nick Piggin wrote:
> On Wed, May 21, 2008 at 11:26:32AM -0400, Benjamin Herrenschmidt wrote:
> > On Wed, 2008-05-21 at 16:12 +0200, Nick Piggin wrote:
> > > lwsync is the recommended method of store/store ordering on caching enabled memory. For those

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Nick Piggin
On Wed, May 21, 2008 at 11:43:00AM -0400, Benjamin Herrenschmidt wrote:
> On Wed, 2008-05-21 at 17:34 +0200, Nick Piggin wrote:
> > On Wed, May 21, 2008 at 11:26:32AM -0400, Benjamin Herrenschmidt wrote:
> > > On Wed, 2008-05-21 at 16:12 +0200, Nick Piggin wrote:
> > > > lwsync is the recommended

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Benjamin Herrenschmidt
On Wed, 2008-05-21 at 17:47 +0200, Nick Piggin wrote:
> OK, but I just don't understand what the problem is... your synch.h has
>
> #ifdef __powerpc64__
> #define __SUBARCH_HAS_LWSYNC
> #endif
>
> #ifdef __SUBARCH_HAS_LWSYNC
> #define LWSYNC lwsync
> #else
> #define LWSYNC sync
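
For readers following along, the compile-time selection quoted from synch.h can be sketched in portable C as below. This is only an illustration: every DEMO_/demo_ name is hypothetical and does not exist in the kernel, and the snippet only reports which mnemonic the preprocessor picked rather than executing it. It also shows the point Ben raises in this thread: the choice is fixed when the code is built, not when the processor is identified at boot.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so the macro is expanded before quoting.
 * DEMO_* names are hypothetical, for illustration only. */
#define DEMO_STR_1(x) #x
#define DEMO_STR(x)   DEMO_STR_1(x)

/* Mirror of the quoted selection: 64-bit PowerPC builds are assumed
 * to have lwsync, everything else falls back to a full sync. */
#ifdef __powerpc64__
#define DEMO_SUBARCH_HAS_LWSYNC
#endif

#ifdef DEMO_SUBARCH_HAS_LWSYNC
#define DEMO_LWSYNC lwsync
#define DEMO_HAS_LWSYNC 1
#else
#define DEMO_LWSYNC sync
#define DEMO_HAS_LWSYNC 0
#endif

/* Returns the mnemonic chosen at compile time, as a string. */
static const char *demo_lwsync_mnemonic(void)
{
    return DEMO_STR(DEMO_LWSYNC);
}

/* Returns 1 if this build assumes lwsync is available, else 0. */
static int demo_has_lwsync(void)
{
    return DEMO_HAS_LWSYNC;
}
```

Calling demo_lwsync_mnemonic() on a build without __powerpc64__ defined yields the string for the sync fallback; nothing in this sketch adapts to the CPU found at run time.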

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Segher Boessenkool
> From memory, I measured lwsync is 5 times faster than eieio on a dual G5. This was on a simple microbenchmark that made use of smp_wmb for store ordering, but it did not involve any IO access (which presumably would disadvantage eieio further).

This is very much specific to your particular

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Segher Boessenkool
+#ifdef __SUBARCH_HAS_LWSYNC
+#define SMPWMB lwsync
+#else
+#define SMPWMB eieio
+#endif
+
 #define smp_mb() mb()
 #define smp_rmb() rmb()
-#define smp_wmb() eieio()
+#define smp_wmb() __asm__ __volatile__ (__stringify(SMPWMB) : : :"memory")

SMPWMB is used only
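
As a rough illustration of what the write barrier in this patch is for, the sketch below publishes a payload and then a ready flag, with a store/store barrier between the two stores. The demo_ names are hypothetical and not kernel API. The PowerPC branches follow the patch's choice of lwsync or eieio; off PowerPC, a GCC/Clang release fence is used as a stand-in, which is an assumption of this example and not part of the patch.

```c
#include <assert.h>

/* Hypothetical stand-in for smp_wmb(); not the kernel's definition. */
#if defined(__powerpc64__)
/* Assumed to have lwsync, as in the patch under discussion. */
#define demo_smp_wmb() __asm__ __volatile__ ("lwsync" : : : "memory")
#elif defined(__powerpc__)
/* 32-bit PowerPC: keep eieio, matching the patch's fallback. */
#define demo_smp_wmb() __asm__ __volatile__ ("eieio" : : : "memory")
#else
/* Assumption: on other targets a release fence orders the stores. */
#define demo_smp_wmb() __atomic_thread_fence(__ATOMIC_RELEASE)
#endif

struct demo_msg {
    int payload;          /* data written first */
    volatile int ready;   /* flag written second */
};

/* Store the payload, then the flag; the barrier keeps the payload
 * store ordered before the flag store as seen by other CPUs. */
static void demo_publish(struct demo_msg *m, int value)
{
    m->payload = value;
    demo_smp_wmb();
    m->ready = 1;
}
```

A reader on another CPU would need a matching read barrier between loading ready and loading payload; that side is not shown here.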

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Benjamin Herrenschmidt
On Wed, 2008-05-21 at 22:12 +0200, Segher Boessenkool wrote:
> No idea about POWER6; for CBE, the backend works similar to the 970 one. Given that the architecture says to use lwsync for cases like this, it would be very surprising if it performed (much) worse than eieio, eh? ;-)

So I think

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Segher Boessenkool
> This is mostly useless then since lwsync is just a sync to a processor that doesn't know it (it's a sync with a reserved bit set) :-) Or it's just to make gas happy if you specify a processor type that doesn't have lwsync?

GAS doesn't care (I tried with -Wa,-m405). Support for this insn was

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Segher Boessenkool
> The main question is whether we care if the downgrade to sync on power3 hurts performance (and does it?), and what we do for 32-bit, as currently no 32-bit implementation has lwsync afaik (though that might not be true for long).

Some time ago, I benchmarked (*) a loop of stw;sync vs.

Re: [patch 2/2] powerpc: optimise smp_wmb

2008-05-21 Thread Nick Piggin
On Wed, May 21, 2008 at 10:12:03PM +0200, Segher Boessenkool wrote:
> > From memory, I measured lwsync is 5 times faster than eieio on a dual G5. This was on a simple microbenchmark that made use of smp_wmb for store ordering, but it did not involve any IO access (which presumably would