Re: [PATCH v2 06/32] s390: reuse asm-generic/barrier.h
On Mon, 4 Jan 2016 22:42:44 +0200 "Michael S. Tsirkin" wrote:
> On Mon, Jan 04, 2016 at 04:03:39PM +0100, Martin Schwidefsky wrote:
> > On Mon, 4 Jan 2016 14:20:42 +0100 Peter Zijlstra wrote:
> > >
> > > On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> > > > On s390 read_barrier_depends, smp_read_barrier_depends,
> > > > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> > > > asm-generic variants exactly. Drop the local definitions and pull in
> > > > asm-generic/barrier.h instead.
> > > >
> > > > This is in preparation to refactoring this code area.
> > > >
> > > > Signed-off-by: Michael S. Tsirkin
> > > > Acked-by: Arnd Bergmann
> > > > ---
> > > >  arch/s390/include/asm/barrier.h | 10 ++--------
> > > >  1 file changed, 2 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> > > > index 7ffd0b1..c358c31 100644
> > > > --- a/arch/s390/include/asm/barrier.h
> > > > +++ b/arch/s390/include/asm/barrier.h
> > > > @@ -30,14 +30,6 @@
> > > >  #define smp_rmb()	rmb()
> > > >  #define smp_wmb()	wmb()
> > > >
> > > > -#define read_barrier_depends()		do { } while (0)
> > > > -#define smp_read_barrier_depends()	do { } while (0)
> > > > -
> > > > -#define smp_mb__before_atomic()		smp_mb()
> > > > -#define smp_mb__after_atomic()		smp_mb()
> > >
> > > As per:
> > >
> > >   lkml.kernel.org/r/20150921112252.3c2937e1@mschwide
> > >
> > > s390 should change this to barrier() instead of smp_mb() and hence
> > > should not use the generic versions.
> >
> > Yes, we wanted to simplify this. Thanks for the reminder, I'll queue
> > a patch.
>
> Could you base on my patchset maybe, to avoid conflicts,
> and I'll merge it?
> Or if it's just replacing these 2 with barrier() I can do this
> myself easily.

Probably the easiest solution if you do the patch yourself and include it
in your patch set.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Mon, 4 Jan 2016 22:18:58 +0200 "Michael S. Tsirkin"wrote: > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote: > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > > > This defines __smp_xxx barriers for s390, > > > for use by virtualization. > > > > > > Some smp_xxx barriers are removed as they are > > > defined correctly by asm-generic/barriers.h > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers > > > unconditionally on this architecture. > > > > > > Signed-off-by: Michael S. Tsirkin > > > Acked-by: Arnd Bergmann > > > --- > > > arch/s390/include/asm/barrier.h | 15 +-- > > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > > > diff --git a/arch/s390/include/asm/barrier.h > > > b/arch/s390/include/asm/barrier.h > > > index c358c31..fbd25b2 100644 > > > --- a/arch/s390/include/asm/barrier.h > > > +++ b/arch/s390/include/asm/barrier.h > > > @@ -26,18 +26,21 @@ > > > #define wmb()barrier() > > > #define dma_rmb()mb() > > > #define dma_wmb()mb() > > > -#define smp_mb() mb() > > > -#define smp_rmb()rmb() > > > -#define smp_wmb()wmb() > > > - > > > -#define smp_store_release(p, v) > > > \ > > > +#define __smp_mb() mb() > > > +#define __smp_rmb() rmb() > > > +#define __smp_wmb() wmb() > > > +#define smp_mb() __smp_mb() > > > +#define smp_rmb()__smp_rmb() > > > +#define smp_wmb()__smp_wmb() > > > > Why define the smp_*mb() primitives here? Would not the inclusion of > > asm-generic/barrier.h do this? > > No because the generic one is a nop on !SMP, this one isn't. > > Pls note this patch is just reordering code without making > functional changes. > And at the moment, on s390 smp_xxx barriers are always non empty. The s390 kernel is SMP to 99.99%, we just didn't bother with a non-smp variant for the memory-barriers. If the generic header is used we'd get the non-smp version for free. It will save a small amount of text space for CONFIG_SMP=n. 
> Some of this could be sub-optimal, but
> since on s390 Linux always runs on a hypervisor,
> I am not sure it's safe to use the generic version -
> in other words, it just might be that for s390 smp_ and virt_
> barriers must be equivalent.

The definition of the memory barriers is independent of whether the
system is running on a hypervisor or not. Is there really an
architecture where you need special virt_xxx barriers?!?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
Re: [PATCH v2 15/32] powerpc: define __smp_xxx
On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote: > Hi Michael, > > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote: > > This defines __smp_xxx barriers for powerpc > > for use by virtualization. > > > > smp_xxx barriers are removed as they are > > defined correctly by asm-generic/barriers.h I think this is the part that was missed in review. > > This reduces the amount of arch-specific boiler-plate code. > > > > Signed-off-by: Michael S. Tsirkin> > Acked-by: Arnd Bergmann > > --- > > arch/powerpc/include/asm/barrier.h | 24 > > 1 file changed, 8 insertions(+), 16 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/barrier.h > > b/arch/powerpc/include/asm/barrier.h > > index 980ad0c..c0deafc 100644 > > --- a/arch/powerpc/include/asm/barrier.h > > +++ b/arch/powerpc/include/asm/barrier.h > > @@ -44,19 +44,11 @@ > > #define dma_rmb() __lwsync() > > #define dma_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : > > :"memory") > > > > -#ifdef CONFIG_SMP > > -#define smp_lwsync() __lwsync() > > +#define __smp_lwsync() __lwsync() > > > > so __smp_lwsync() is always mapped to lwsync, right? Yes. 
> > -#define smp_mb() mb() > > -#define smp_rmb() __lwsync() > > -#define smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : : > > :"memory") > > -#else > > -#define smp_lwsync() barrier() > > - > > -#define smp_mb() barrier() > > -#define smp_rmb() barrier() > > -#define smp_wmb() barrier() > > -#endif /* CONFIG_SMP */ > > +#define __smp_mb() mb() > > +#define __smp_rmb()__lwsync() > > +#define __smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : > > : :"memory") > > > > /* > > * This is a barrier which prevents following instructions from being > > @@ -67,18 +59,18 @@ > > #define data_barrier(x)\ > > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory"); > > > > -#define smp_store_release(p, v) > > \ > > +#define __smp_store_release(p, v) > > \ > > do { > > \ > > compiletime_assert_atomic_type(*p); \ > > - smp_lwsync(); \ > > + __smp_lwsync(); \ > > , therefore this will emit an lwsync no matter SMP or UP. Absolutely. But smp_store_release (without __) will not. Please note I did test this: for ppc code before and after this patch generates exactly the same binary on SMP and UP. > Another thing is that smp_lwsync() may have a third user(other than > smp_load_acquire() and smp_store_release()): > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877 > > I'm OK to change my patch accordingly, but do we really want > smp_lwsync() get involved in this cleanup? If I understand you > correctly, this cleanup focuses on external API like smp_{r,w,}mb(), > while smp_lwsync() is internal to PPC. > > Regards, > Boqun I think you missed the leading ___ :) smp_store_release is external and it needs __smp_lwsync as defined here. I can duplicate some code and have smp_lwsync *not* call __smp_lwsync but why do this? Still, if you prefer it this way, please let me know. 
> > WRITE_ONCE(*p, v); \ > > } while (0) > > > > -#define smp_load_acquire(p) > > \ > > +#define __smp_load_acquire(p) > > \ > > ({ \ > > typeof(*p) ___p1 = READ_ONCE(*p); \ > > compiletime_assert_atomic_type(*p); \ > > - smp_lwsync(); \ > > + __smp_lwsync(); \ > > ___p1; \ > > }) > > > > -- > > MST > > > > -- > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > > the body of a message to majord...@vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html > > Please read the FAQ at http://www.tux.org/lkml/ ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote: > On Mon, 4 Jan 2016 22:18:58 +0200 > "Michael S. Tsirkin"wrote: > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote: > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > > > > This defines __smp_xxx barriers for s390, > > > > for use by virtualization. > > > > > > > > Some smp_xxx barriers are removed as they are > > > > defined correctly by asm-generic/barriers.h > > > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers > > > > unconditionally on this architecture. > > > > > > > > Signed-off-by: Michael S. Tsirkin > > > > Acked-by: Arnd Bergmann > > > > --- > > > > arch/s390/include/asm/barrier.h | 15 +-- > > > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > > > > > diff --git a/arch/s390/include/asm/barrier.h > > > > b/arch/s390/include/asm/barrier.h > > > > index c358c31..fbd25b2 100644 > > > > --- a/arch/s390/include/asm/barrier.h > > > > +++ b/arch/s390/include/asm/barrier.h > > > > @@ -26,18 +26,21 @@ > > > > #define wmb() barrier() > > > > #define dma_rmb() mb() > > > > #define dma_wmb() mb() > > > > -#define smp_mb() mb() > > > > -#define smp_rmb() rmb() > > > > -#define smp_wmb() wmb() > > > > - > > > > -#define smp_store_release(p, v) > > > > \ > > > > +#define __smp_mb() mb() > > > > +#define __smp_rmb()rmb() > > > > +#define __smp_wmb()wmb() > > > > +#define smp_mb() __smp_mb() > > > > +#define smp_rmb() __smp_rmb() > > > > +#define smp_wmb() __smp_wmb() > > > > > > Why define the smp_*mb() primitives here? Would not the inclusion of > > > asm-generic/barrier.h do this? > > > > No because the generic one is a nop on !SMP, this one isn't. > > > > Pls note this patch is just reordering code without making > > functional changes. > > And at the moment, on s390 smp_xxx barriers are always non empty. > > The s390 kernel is SMP to 99.99%, we just didn't bother with a > non-smp variant for the memory-barriers. 
If the generic header > is used we'd get the non-smp version for free. It will save a > small amount of text space for CONFIG_SMP=n. OK, so I'll queue a patch to do this then? Just to make sure: the question would be, are smp_xxx barriers ever used in s390 arch specific code to flush in/out memory accesses for synchronization with the hypervisor? I went over s390 arch code and it seems to me the answer is no (except of course for virtio). But I also see a lot of weirdness on this architecture. I found these calls: arch/s390/include/asm/bitops.h: smp_mb__before_atomic(); arch/s390/include/asm/bitops.h: smp_mb(); Not used in arch specific code so this is likely OK. arch/s390/kernel/vdso.c:smp_mb(); Looking at Author: Christian Borntraeger Date: Fri Sep 11 16:23:06 2015 +0200 s390/vdso: use correct memory barrier By definition smp_wmb only orders writes against writes. (Finish all previous writes, and do not start any future write). To protect the vdso init code against early reads on other CPUs, let's use a full smp_mb at the end of vdso init. As right now smp_wmb is implemented as full serialization, this needs no stable backport, but this change will be necessary if we reimplement smp_wmb. ok from hypervisor point of view, but it's also strange: 1. why isn't this paired with another mb somewhere? this seems to violate barrier pairing rules. 2. how does smp_mb protect against early reads on other CPUs? It normally does not: it orders reads from this CPU versus writes from same CPU. But init code does not appear to read anything. Maybe this is some s390 specific trick? I could not figure out the above commit. arch/s390/kvm/kvm-s390.c: smp_mb(); Does not appear to be paired with anything. arch/s390/lib/spinlock.c: smp_mb(); arch/s390/lib/spinlock.c: smp_mb(); Seems ok, and appears paired properly. Just to make sure - spinlock is not paravirtualized on s390, is it? 
arch/s390/kernel/time.c:	smp_wmb();
arch/s390/kernel/time.c:	smp_wmb();
arch/s390/kernel/time.c:	smp_wmb();
arch/s390/kernel/time.c:	smp_wmb();

It's all around vdso, so I'm guessing userspace is using this,
this is why there's no pairing.

> > Some of this could be sub-optimal, but
> > since on s390 Linux always runs on a hypervisor,
> > I am not sure it's safe to use the generic version -
> > in other words, it just might be that for s390 smp_ and virt_
> > barriers must be equivalent.
>
Re: [PATCH v2 15/32] powerpc: define __smp_xxx
On Tue, Jan 05, 2016 at 10:51:17AM +0200, Michael S. Tsirkin wrote: > On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote: > > Hi Michael, > > > > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote: > > > This defines __smp_xxx barriers for powerpc > > > for use by virtualization. > > > > > > smp_xxx barriers are removed as they are > > > defined correctly by asm-generic/barriers.h > > I think this is the part that was missed in review. > Yes, I realized my mistake after reread the series. But smp_lwsync() is not defined in asm-generic/barriers.h, right? > > > This reduces the amount of arch-specific boiler-plate code. > > > > > > Signed-off-by: Michael S. Tsirkin> > > Acked-by: Arnd Bergmann > > > --- > > > arch/powerpc/include/asm/barrier.h | 24 > > > 1 file changed, 8 insertions(+), 16 deletions(-) > > > > > > diff --git a/arch/powerpc/include/asm/barrier.h > > > b/arch/powerpc/include/asm/barrier.h > > > index 980ad0c..c0deafc 100644 > > > --- a/arch/powerpc/include/asm/barrier.h > > > +++ b/arch/powerpc/include/asm/barrier.h > > > @@ -44,19 +44,11 @@ > > > #define dma_rmb()__lwsync() > > > #define dma_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > : :"memory") > > > > > > -#ifdef CONFIG_SMP > > > -#define smp_lwsync() __lwsync() > > > +#define __smp_lwsync() __lwsync() > > > > > > > so __smp_lwsync() is always mapped to lwsync, right? > > Yes. 
> > > > -#define smp_mb() mb() > > > -#define smp_rmb()__lwsync() > > > -#define smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > : :"memory") > > > -#else > > > -#define smp_lwsync() barrier() > > > - > > > -#define smp_mb() barrier() > > > -#define smp_rmb()barrier() > > > -#define smp_wmb()barrier() > > > -#endif /* CONFIG_SMP */ > > > +#define __smp_mb() mb() > > > +#define __smp_rmb() __lwsync() > > > +#define __smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > : :"memory") > > > > > > /* > > > * This is a barrier which prevents following instructions from being > > > @@ -67,18 +59,18 @@ > > > #define data_barrier(x) \ > > > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory"); > > > > > > -#define smp_store_release(p, v) > > > \ > > > +#define __smp_store_release(p, v) > > > \ > > > do { > > > \ > > > compiletime_assert_atomic_type(*p); \ > > > - smp_lwsync(); \ > > > + __smp_lwsync(); \ > > > > , therefore this will emit an lwsync no matter SMP or UP. > > Absolutely. But smp_store_release (without __) will not. > > Please note I did test this: for ppc code before and after > this patch generates exactly the same binary on SMP and UP. > Yes, you're right, sorry for my mistake... > > > Another thing is that smp_lwsync() may have a third user(other than > > smp_load_acquire() and smp_store_release()): > > > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877 > > > > I'm OK to change my patch accordingly, but do we really want > > smp_lwsync() get involved in this cleanup? If I understand you > > correctly, this cleanup focuses on external API like smp_{r,w,}mb(), > > while smp_lwsync() is internal to PPC. > > > > Regards, > > Boqun > > I think you missed the leading ___ :) > What I mean here was smp_lwsync() was originally internal to PPC, but never mind ;-) > smp_store_release is external and it needs __smp_lwsync as > defined here. 
> > I can duplicate some code and have smp_lwsync *not* call __smp_lwsync You mean bringing smp_lwsync() back? because I haven't seen you defining in asm-generic/barriers.h in previous patches and you just delete it in this patch. > but why do this? Still, if you prefer it this way, > please let me know. > I think deleting smp_lwsync() is fine, though I need to change atomic variants patches on PPC because of it ;-/ Regards, Boqun > > > WRITE_ONCE(*p, v); \ > > > } while (0) > > > > > > -#define smp_load_acquire(p) > > > \ > > > +#define __smp_load_acquire(p) > > > \ > > > ({ > > > \ > > > typeof(*p) ___p1 = READ_ONCE(*p); \ > > > compiletime_assert_atomic_type(*p); \ > > > - smp_lwsync(); \ > > > + __smp_lwsync(); \ > > > ___p1; \ > > > }) > > > > > > -- > >
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value. All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter.

Is x86_64 part of this 'All' subset ? ;)

> This was found using american fuzzy
> lop.
>
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs. Except for ARM, the
> rest have only been compile-tested.
>
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent
> ---
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
On Tue, Jan 05, 2016 at 05:36:47PM +0100, Daniel Borkmann wrote:
> On 01/05/2016 04:23 PM, Rabin Vincent wrote:
> > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> > instructions since it XORs A with X while all the others replace A with
> > some loaded value. All the BPF JITs fail to clear A if this is used as
> > the first instruction in a filter. This was found using american fuzzy
> > lop.
> >
> > Add a helper to determine if A needs to be cleared given the first
> > instruction in a filter, and use this in the JITs. Except for ARM, the
> > rest have only been compile-tested.
> >
> > Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> > Signed-off-by: Rabin Vincent
>
> Excellent catch, thanks a lot! The fix looks good to me and should
> go to -net tree.
>
> Acked-by: Daniel Borkmann

good catch indeed. Classic bpf jits didn't have much love.
Great to see this work.

Acked-by: Alexei Starovoitov
Re: [PATCH v2 15/32] powerpc: define __smp_xxx
On Tue, Jan 05, 2016 at 05:53:41PM +0800, Boqun Feng wrote: > On Tue, Jan 05, 2016 at 10:51:17AM +0200, Michael S. Tsirkin wrote: > > On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote: > > > Hi Michael, > > > > > > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote: > > > > This defines __smp_xxx barriers for powerpc > > > > for use by virtualization. > > > > > > > > smp_xxx barriers are removed as they are > > > > defined correctly by asm-generic/barriers.h > > > > I think this is the part that was missed in review. > > > > Yes, I realized my mistake after reread the series. But smp_lwsync() is > not defined in asm-generic/barriers.h, right? It isn't because as far as I could tell it is not used outside arch/powerpc/include/asm/barrier.h smp_store_release and smp_load_acquire. And these are now gone. Instead there are __smp_store_release and __smp_load_acquire which call __smp_lwsync. These are only used for virt and on SMP. UP variants are generic - they just call barrier(). > > > > This reduces the amount of arch-specific boiler-plate code. > > > > > > > > Signed-off-by: Michael S. Tsirkin> > > > Acked-by: Arnd Bergmann > > > > --- > > > > arch/powerpc/include/asm/barrier.h | 24 > > > > 1 file changed, 8 insertions(+), 16 deletions(-) > > > > > > > > diff --git a/arch/powerpc/include/asm/barrier.h > > > > b/arch/powerpc/include/asm/barrier.h > > > > index 980ad0c..c0deafc 100644 > > > > --- a/arch/powerpc/include/asm/barrier.h > > > > +++ b/arch/powerpc/include/asm/barrier.h > > > > @@ -44,19 +44,11 @@ > > > > #define dma_rmb() __lwsync() > > > > #define dma_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > > : :"memory") > > > > > > > > -#ifdef CONFIG_SMP > > > > -#define smp_lwsync() __lwsync() > > > > +#define __smp_lwsync() __lwsync() > > > > > > > > > > so __smp_lwsync() is always mapped to lwsync, right? > > > > Yes. 
> > > > > > -#define smp_mb() mb() > > > > -#define smp_rmb() __lwsync() > > > > -#define smp_wmb() __asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > > : :"memory") > > > > -#else > > > > -#define smp_lwsync() barrier() > > > > - > > > > -#define smp_mb() barrier() > > > > -#define smp_rmb() barrier() > > > > -#define smp_wmb() barrier() > > > > -#endif /* CONFIG_SMP */ > > > > +#define __smp_mb() mb() > > > > +#define __smp_rmb()__lwsync() > > > > +#define __smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : > > > > : :"memory") > > > > > > > > /* > > > > * This is a barrier which prevents following instructions from being > > > > @@ -67,18 +59,18 @@ > > > > #define data_barrier(x)\ > > > > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory"); > > > > > > > > -#define smp_store_release(p, v) > > > > \ > > > > +#define __smp_store_release(p, v) > > > > \ > > > > do { > > > > \ > > > > compiletime_assert_atomic_type(*p); > > > > \ > > > > - smp_lwsync(); > > > > \ > > > > + __smp_lwsync(); > > > > \ > > > > > > , therefore this will emit an lwsync no matter SMP or UP. > > > > Absolutely. But smp_store_release (without __) will not. > > > > Please note I did test this: for ppc code before and after > > this patch generates exactly the same binary on SMP and UP. > > > > Yes, you're right, sorry for my mistake... > > > > > > Another thing is that smp_lwsync() may have a third user(other than > > > smp_load_acquire() and smp_store_release()): > > > > > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877 > > > > > > I'm OK to change my patch accordingly, but do we really want > > > smp_lwsync() get involved in this cleanup? If I understand you > > > correctly, this cleanup focuses on external API like smp_{r,w,}mb(), > > > while smp_lwsync() is internal to PPC. 
> > > > > > Regards, > > > Boqun > > > > I think you missed the leading ___ :) > > > > What I mean here was smp_lwsync() was originally internal to PPC, but > never mind ;-) > > > smp_store_release is external and it needs __smp_lwsync as > > defined here. > > > > I can duplicate some code and have smp_lwsync *not* call __smp_lwsync > > You mean bringing smp_lwsync() back? because I haven't seen you defining > in asm-generic/barriers.h in previous patches and you just delete it in > this patch. > > > but why do this? Still, if you prefer it this way, > > please let me know. > > > > I think deleting smp_lwsync() is fine, though I need to change atomic > variants patches on PPC because of it ;-/ > > Regards, > Boqun Sorry, I don't understand - why do you have to do anything? I changed
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Tue, Jan 05, 2016 at 04:39:37PM +0100, Christian Borntraeger wrote:
> On 01/05/2016 10:30 AM, Michael S. Tsirkin wrote:
> >
> > arch/s390/kernel/vdso.c:	smp_mb();
> >
> > Looking at
> > Author: Christian Borntraeger
> > Date:   Fri Sep 11 16:23:06 2015 +0200
> >
> >     s390/vdso: use correct memory barrier
> >
> >     By definition smp_wmb only orders writes against writes. (Finish all
> >     previous writes, and do not start any future write). To protect the
> >     vdso init code against early reads on other CPUs, let's use a full
> >     smp_mb at the end of vdso init. As right now smp_wmb is implemented
> >     as full serialization, this needs no stable backport, but this change
> >     will be necessary if we reimplement smp_wmb.
> >
> > ok from hypervisor point of view, but it's also strange:
> > 1. why isn't this paired with another mb somewhere?
> >    this seems to violate barrier pairing rules.
> > 2. how does smp_mb protect against early reads on other CPUs?
> >    It normally does not: it orders reads from this CPU versus writes
> >    from same CPU. But init code does not appear to read anything.
> >    Maybe this is some s390 specific trick?
> >
> > I could not figure out the above commit.
>
> It was probably me misreading the code. I changed a wmb into a full mb here
> since I was changing the definition of wmb to a compiler barrier. I tried to
> fix up all users of wmb that really pair with other code. I assumed that there
> must be some reader (as there was a wmb before) but I could not figure out
> which. So I just played safe here.
>
> But it probably can be removed.
>
> > arch/s390/kvm/kvm-s390.c:	smp_mb();
>
> This can go. If you have a patch, I can carry that via the kvms390 tree,
> or I will spin a new patch with you as suggested-by.
>
> Christian

I have both, will post shortly.

-- 
MST
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
On 01/05/2016 05:03 PM, Rabin Vincent wrote:
> On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote:
> > On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> > > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> > > instructions since it XORs A with X while all the others replace A with
> > > some loaded value. All the BPF JITs fail to clear A if this is used as
> > > the first instruction in a filter.
> >
> > Is x86_64 part of this 'All' subset ? ;)
>
> No, because it's an eBPF JIT.

Correct, filter conversion to eBPF clears it already.
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
On 01/05/2016 04:23 PM, Rabin Vincent wrote:
> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value. All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter. This was found using american fuzzy
> lop.
>
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs. Except for ARM, the
> rest have only been compile-tested.
>
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent

Excellent catch, thanks a lot! The fix looks good to me and should
go to -net tree.

Acked-by: Daniel Borkmann

If you're interested, feel free to add a small test case for the
SKF_AD_ALU_XOR_X issue to lib/test_bpf.c for -net-next tree. Thanks!
Re: [PATCH v2 22/32] s390: define __smp_xxx
On 01/05/2016 10:30 AM, Michael S. Tsirkin wrote:
>
> arch/s390/kernel/vdso.c:	smp_mb();
>
> Looking at
> Author: Christian Borntraeger
> Date:   Fri Sep 11 16:23:06 2015 +0200
>
>     s390/vdso: use correct memory barrier
>
>     By definition smp_wmb only orders writes against writes. (Finish all
>     previous writes, and do not start any future write). To protect the
>     vdso init code against early reads on other CPUs, let's use a full
>     smp_mb at the end of vdso init. As right now smp_wmb is implemented
>     as full serialization, this needs no stable backport, but this change
>     will be necessary if we reimplement smp_wmb.
>
> ok from hypervisor point of view, but it's also strange:
> 1. why isn't this paired with another mb somewhere?
>    this seems to violate barrier pairing rules.
> 2. how does smp_mb protect against early reads on other CPUs?
>    It normally does not: it orders reads from this CPU versus writes
>    from same CPU. But init code does not appear to read anything.
>    Maybe this is some s390 specific trick?
>
> I could not figure out the above commit.

It was probably me misreading the code. I changed a wmb into a full mb here
since I was changing the definition of wmb to a compiler barrier. I tried to
fix up all users of wmb that really pair with other code. I assumed that there
must be some reader (as there was a wmb before) but I could not figure out
which. So I just played safe here.

But it probably can be removed.

> arch/s390/kvm/kvm-s390.c:	smp_mb();

This can go. If you have a patch, I can carry that via the kvms390 tree,
or I will spin a new patch with you as suggested-by.

Christian
Re: [alsa-devel] [PATCH 1/3] ASoC: fsl_ssi: mark SACNT register volatile
Timur,

On Thu, Dec 24, 2015 at 2:12 PM, Timur Tabi wrote:
> On Sun, Dec 20, 2015 at 2:30 PM, Maciej S. Szmigiero wrote:
> > SACNT register should be marked volatile since
> > its WR and RD bits are cleared by SSI after
> > completing the relevant operation.
> > This unbreaks AC'97 register access.
> >
> > Fixes: 05cf237972fe ("ASoC: fsl_ssi: Add driver suspend and resume to
> > support MEGA Fast")
> >
> > Signed-off-by: Maciej S. Szmigiero
>
> These patches seem okay, but can we hold off merging them until I get
> back from vacation and have a chance to review them?

Have you had a chance to review these patches?
Re: [PATCH v4 2/9] ppc64le FTRACE_WITH_REGS implementation
On Wed, Dec 02, 2015 at 09:18:05AM +1100, Michael Ellerman wrote:
>
> I (still) haven't had a chance to have a good look at it, but I won't this week
> anyway. So post v5 and hopefully I can review that and it will be perfect :)

The perfect v5 is there now, for 4 weeks minus the holiday season, and
no further problems arose. Independently verified by Petr Mladek --
don't forget his high-level fix.

	Torsten
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Tue, 5 Jan 2016 15:04:43 +0200 "Michael S. Tsirkin"wrote: > On Tue, Jan 05, 2016 at 01:08:52PM +0100, Martin Schwidefsky wrote: > > On Tue, 5 Jan 2016 11:30:19 +0200 > > "Michael S. Tsirkin" wrote: > > > > > On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote: > > > > On Mon, 4 Jan 2016 22:18:58 +0200 > > > > "Michael S. Tsirkin" wrote: > > > > > > > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote: > > > > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > > > > > > > This defines __smp_xxx barriers for s390, > > > > > > > for use by virtualization. > > > > > > > > > > > > > > Some smp_xxx barriers are removed as they are > > > > > > > defined correctly by asm-generic/barriers.h > > > > > > > > > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers > > > > > > > unconditionally on this architecture. > > > > > > > > > > > > > > Signed-off-by: Michael S. Tsirkin > > > > > > > Acked-by: Arnd Bergmann > > > > > > > --- > > > > > > > arch/s390/include/asm/barrier.h | 15 +-- > > > > > > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > > > > > > > > > > > diff --git a/arch/s390/include/asm/barrier.h > > > > > > > b/arch/s390/include/asm/barrier.h > > > > > > > index c358c31..fbd25b2 100644 > > > > > > > --- a/arch/s390/include/asm/barrier.h > > > > > > > +++ b/arch/s390/include/asm/barrier.h > > > > > > > @@ -26,18 +26,21 @@ > > > > > > > #define wmb()barrier() > > > > > > > #define dma_rmb()mb() > > > > > > > #define dma_wmb()mb() > > > > > > > -#define smp_mb() mb() > > > > > > > -#define smp_rmb()rmb() > > > > > > > -#define smp_wmb()wmb() > > > > > > > - > > > > > > > -#define smp_store_release(p, v) > > > > > > > \ > > > > > > > +#define __smp_mb() mb() > > > > > > > +#define __smp_rmb() rmb() > > > > > > > +#define __smp_wmb() wmb() > > > > > > > +#define smp_mb() __smp_mb() > > > > > > > +#define smp_rmb()__smp_rmb() > > > > > > > +#define smp_wmb()__smp_wmb() > > > > > > > > > 
> > > Why define the smp_*mb() primitives here? Would not the inclusion of > > > > > > asm-generic/barrier.h do this? > > > > > > > > > > No because the generic one is a nop on !SMP, this one isn't. > > > > > > > > > > Pls note this patch is just reordering code without making > > > > > functional changes. > > > > > And at the moment, on s390 smp_xxx barriers are always non empty. > > > > > > > > The s390 kernel is SMP to 99.99%, we just didn't bother with a > > > > non-smp variant for the memory-barriers. If the generic header > > > > is used we'd get the non-smp version for free. It will save a > > > > small amount of text space for CONFIG_SMP=n. > > > > > > OK, so I'll queue a patch to do this then? > > > > Yes please. > > OK, I'll add a patch on top in v3. Good, with this addition: Acked-by: Martin Schwidefsky > > > Just to make sure: the question would be, are smp_xxx barriers ever used > > > in s390 arch specific code to flush in/out memory accesses for > > > synchronization with the hypervisor? > > > > > > I went over s390 arch code and it seems to me the answer is no > > > (except of course for virtio). > > > > Correct. Guest to host communication either uses instructions which > > imply a memory barrier or QDIO which uses atomics. > > And atomics imply a barrier on s390, right? Yes they do. > > > But I also see a lot of weirdness on this architecture. > > > > Mostly historical, s390 actually is one of the easiest architectures in > > regard to memory barriers. > > > > > I found these calls: > > > > > > arch/s390/include/asm/bitops.h: smp_mb__before_atomic(); > > > arch/s390/include/asm/bitops.h: smp_mb(); > > > > > > Not used in arch specific code so this is likely OK. > > > > This has been introduced with git commit 5402ea6af11dc5a9, the smp_mb > > and smp_mb__before_atomic are used in clear_bit_unlock and > > __clear_bit_unlock which are 1:1 copies from the code in > > include/asm-generic/bitops/lock.h. 
Only test_and_set_bit_lock differs > > from the generic implementation. > > something to keep in mind, but > I'd rather not touch bitops at the moment - this patchset is already too big. With the conversion of smp_mb__before_atomic() to barrier() it does the correct thing. I don't think that any change is necessary. -- blue skies, Martin. "Reality continues to ruin my life." - Calvin. ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
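For context, the generic clear_bit_unlock() under discussion has roughly this shape. This is a host-compilable model, not kernel code: the real clear_bit() is atomic, and smp_mb__before_atomic() is the very arch barrier being debated above (modelled here as a plain compiler barrier).

```c
#include <assert.h>

/* Stand-in for the arch barrier; on s390 the question above is whether
 * this can be barrier() instead of a full smp_mb(). */
#define smp_mb__before_atomic() __asm__ __volatile__("" : : : "memory")

/* Simplified, non-atomic model of the kernel's clear_bit(). */
static inline void clear_bit(unsigned int nr, volatile unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

/* Shape of clear_bit_unlock() as in include/asm-generic/bitops/lock.h:
 * order all prior accesses before the clearing store (release semantics). */
static inline void clear_bit_unlock(unsigned int nr, volatile unsigned long *addr)
{
	smp_mb__before_atomic();
	clear_bit(nr, addr);
}
```

The point of the thread is only the barrier strength, not the bit manipulation itself.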
Re: [PATCH v2 17/32] arm: define __smp_xxx
On Mon, Jan 04, 2016 at 01:59:34PM +, Russell King - ARM Linux wrote: > On Mon, Jan 04, 2016 at 02:54:20PM +0100, Peter Zijlstra wrote: > > On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote: > > > On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote: > > > > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux > > > > wrote: > > > > > > > > My only concern is that it gives people an additional handle onto a > > > > > "new" set of barriers - just because they're prefixed with __* > > > > > unfortunately doesn't stop anyone from using it (been there with > > > > > other arch stuff before.) > > > > > > > > > > I wonder whether we should consider making the smp memory barriers > > > > > inline functions, so these __smp_xxx() variants can be undef'd > > > > > afterwards, thereby preventing drivers getting their hands on these > > > > > new macros? > > > > > > > > That'd be tricky to do cleanly since asm-generic depends on > > > > ifndef to add generic variants where needed. > > > > > > > > But it would be possible to add a checkpatch test for this. > > > > > > Wasn't the whole purpose of these things for 'drivers' (namely > > > virtio/xen hypervisor interaction) to use these? > > > > Ah, I see, you add virt_*mb() stuff later on for that use case. > > > > So, assuming everybody does include asm-generic/barrier.h, you could > > simply #undef the __smp version at the end of that, once we've generated > > all the regular primitives from it, no? > > Not so simple - that's why I mentioned using inline functions. > > The new smp_* _macros_ are: > > +#define smp_mb() __smp_mb() > > which means if we simply #undef __smp_mb(), smp_mb() then points at > something which is no longer available, and we'll end up with errors > saying that __smp_mb() doesn't exist. 
> > My suggestion was to change: > > #ifndef smp_mb > #define smp_mb() __smp_mb() > #endif > > to: > > #ifndef smp_mb > static inline void smp_mb(void) > { > __smp_mb(); > } > #endif > > which then means __smp_mb() and friends can be #undef'd afterwards. Absolutely, I got it. The issue is that e.g. tile has: #define __smp_mb__after_atomic() do { } while (0) and this is cheaper than barrier(). For this reason I left #define smp_mb__after_atomic() __smp_mb__after_atomic() in place there. Now, of course I can do (in asm-generic): #ifndef smp_mb__after_atomic static inline void smp_mb__after_atomic(void) { ... } #endif but this seems ugly: architectures do defines, generic version does inline. And that is not all: APIs like smp_store_mb can take a variety of types as arguments so they pretty much must be implemented as macros. Teaching checkpatch.pl to complain about it seems like the cleanest approach. > -- > RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/ > FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up > according to speedtest.net.
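A minimal, compilable sketch of the #undef trick Russell is proposing (the mb_count counter is pure instrumentation for this sketch, not anything in the kernel). Because the macro body is expanded when the inline function is parsed, the double-underscore name can be hidden afterwards without breaking smp_mb():

```c
#include <assert.h>

static int mb_count;			/* instrumentation only */

#define __smp_mb()	(mb_count++)	/* stand-in for the arch barrier */

/* Wrap the arch macro in an inline function; the macro is expanded
 * right here, while the function body is being compiled... */
#ifndef smp_mb
static inline void smp_mb(void)
{
	__smp_mb();
}
#endif

/* ...so the __smp_* name can now be withdrawn from drivers entirely. */
#undef __smp_mb
```

After the #undef, any driver writing `__smp_mb()` fails to compile, while `smp_mb()` keeps working — which is exactly the encapsulation being argued for, and why MST's counter-argument centres on macros like smp_store_mb() that take arbitrarily typed arguments.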
Re: [PATCH v3 0/2] cpufreq: powernv: Redesign the presentation of throttle notification
Shilpasri G Bhat writes: > In POWER8, OCC (On-Chip-Controller) can throttle the frequency of the > CPU when the chip crosses its thermal and power limits. Currently, > powernv-cpufreq driver detects and reports this event as a console > message. Some boxes may not sustain the max turbo frequency in all > conditions and can be throttled frequently. This can lead to the > flooding of console with throttle messages. So this patchset aims to > redesign the presentation of this event via sysfs counters and > tracepoints. This likely should CC stable@ as we have seen this on some machines, at least in the lab. -- Stewart Smith OPAL Architect, IBM.
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
From: Rabin Vincent Date: Tue, 5 Jan 2016 16:23:07 +0100 > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data > instructions since it XORs A with X while all the others replace A with > some loaded value. All the BPF JITs fail to clear A if this is used as > the first instruction in a filter. This was found using american fuzzy > lop. > > Add a helper to determine if A needs to be cleared given the first > instruction in a filter, and use this in the JITs. Except for ARM, the > rest have only been compile-tested. > > Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum") > Signed-off-by: Rabin Vincent Applied and queued up for -stable, thanks!
[PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data instructions since it XORs A with X while all the others replace A with some loaded value. All the BPF JITs fail to clear A if this is used as the first instruction in a filter. This was found using american fuzzy lop. Add a helper to determine if A needs to be cleared given the first instruction in a filter, and use this in the JITs. Except for ARM, the rest have only been compile-tested. Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum") Signed-off-by: Rabin Vincent--- arch/arm/net/bpf_jit_32.c | 16 +--- arch/mips/net/bpf_jit.c | 16 +--- arch/powerpc/net/bpf_jit_comp.c | 13 ++--- arch/sparc/net/bpf_jit_comp.c | 17 ++--- include/linux/filter.h | 19 +++ 5 files changed, 25 insertions(+), 56 deletions(-) diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c index 591f9db3bf40..e153eb065fe4 100644 --- a/arch/arm/net/bpf_jit_32.c +++ b/arch/arm/net/bpf_jit_32.c @@ -187,19 +187,6 @@ static inline int mem_words_used(struct jit_ctx *ctx) return fls(ctx->seen & SEEN_MEM); } -static inline bool is_load_to_a(u16 inst) -{ - switch (inst) { - case BPF_LD | BPF_W | BPF_LEN: - case BPF_LD | BPF_W | BPF_ABS: - case BPF_LD | BPF_H | BPF_ABS: - case BPF_LD | BPF_B | BPF_ABS: - return true; - default: - return false; - } -} - static void jit_fill_hole(void *area, unsigned int size) { u32 *ptr; @@ -211,7 +198,6 @@ static void jit_fill_hole(void *area, unsigned int size) static void build_prologue(struct jit_ctx *ctx) { u16 reg_set = saved_regs(ctx); - u16 first_inst = ctx->skf->insns[0].code; u16 off; #ifdef CONFIG_FRAME_POINTER @@ -241,7 +227,7 @@ static void build_prologue(struct jit_ctx *ctx) emit(ARM_MOV_I(r_X, 0), ctx); /* do not leak kernel data to userspace */ - if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst))) + if (bpf_needs_clear_a(>skf->insns[0])) emit(ARM_MOV_I(r_A, 0), ctx); /* stack space for the BPF_MEM words */ diff --git a/arch/mips/net/bpf_jit.c 
b/arch/mips/net/bpf_jit.c index 77cb27309db2..1a8c96035716 100644 --- a/arch/mips/net/bpf_jit.c +++ b/arch/mips/net/bpf_jit.c @@ -521,19 +521,6 @@ static inline u16 align_sp(unsigned int num) return num; } -static bool is_load_to_a(u16 inst) -{ - switch (inst) { - case BPF_LD | BPF_W | BPF_LEN: - case BPF_LD | BPF_W | BPF_ABS: - case BPF_LD | BPF_H | BPF_ABS: - case BPF_LD | BPF_B | BPF_ABS: - return true; - default: - return false; - } -} - static void save_bpf_jit_regs(struct jit_ctx *ctx, unsigned offset) { int i = 0, real_off = 0; @@ -614,7 +601,6 @@ static unsigned int get_stack_depth(struct jit_ctx *ctx) static void build_prologue(struct jit_ctx *ctx) { - u16 first_inst = ctx->skf->insns[0].code; int sp_off; /* Calculate the total offset for the stack pointer */ @@ -641,7 +627,7 @@ static void build_prologue(struct jit_ctx *ctx) emit_jit_reg_move(r_X, r_zero, ctx); /* Do not leak kernel data to userspace */ - if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst))) + if (bpf_needs_clear_a(>skf->insns[0])) emit_jit_reg_move(r_A, r_zero, ctx); } diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c index 04782164ee67..2d66a8446198 100644 --- a/arch/powerpc/net/bpf_jit_comp.c +++ b/arch/powerpc/net/bpf_jit_comp.c @@ -78,18 +78,9 @@ static void bpf_jit_build_prologue(struct bpf_prog *fp, u32 *image, PPC_LI(r_X, 0); } - switch (filter[0].code) { - case BPF_RET | BPF_K: - case BPF_LD | BPF_W | BPF_LEN: - case BPF_LD | BPF_W | BPF_ABS: - case BPF_LD | BPF_H | BPF_ABS: - case BPF_LD | BPF_B | BPF_ABS: - /* first instruction sets A register (or is RET 'constant') */ - break; - default: - /* make sure we dont leak kernel information to user */ + /* make sure we dont leak kernel information to user */ + if (bpf_needs_clear_a([0])) PPC_LI(r_A, 0); - } } static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx) diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c index 22564f5f2364..3e6e05a7c4c2 
100644 --- a/arch/sparc/net/bpf_jit_comp.c +++ b/arch/sparc/net/bpf_jit_comp.c @@ -420,22 +420,9 @@ void bpf_jit_compile(struct bpf_prog *fp) } emit_reg_move(O7, r_saved_O7); - switch (filter[0].code) { - case BPF_RET | BPF_K: - case BPF_LD | BPF_W | BPF_LEN: - case BPF_LD | BPF_W | BPF_ABS: -
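The diff above is cut off before the include/linux/filter.h hunk that adds the new helper. Based on the call sites and the commit message, the helper presumably has roughly this shape; the BPF_* opcode constants and the struct sock_filter layout are reproduced locally so the sketch is self-contained:

```c
#include <stdbool.h>
#include <stdint.h>

/* Classic-BPF opcode constants as in <linux/bpf_common.h>, reproduced
 * here so the sketch compiles on its own. */
#define BPF_LD	0x00
#define BPF_RET	0x06
#define BPF_W	0x00
#define BPF_H	0x08
#define BPF_B	0x10
#define BPF_K	0x00
#define BPF_ABS	0x20
#define BPF_LEN	0x80

#define SKF_AD_OFF		(-0x1000)
#define SKF_AD_ALU_XOR_X	40

struct sock_filter {
	uint16_t code;
	uint8_t  jt, jf;
	uint32_t k;
};

/* Presumed shape of the helper: A must be zeroed in the prologue unless
 * the first instruction unconditionally writes A (or is RET K, which
 * never reads it).  The SKF_AD_ALU_XOR_X "load" is the odd one out:
 * despite being encoded as BPF_LD|BPF_*|BPF_ABS it performs A ^= X,
 * reading the stale A — which is exactly the bug being fixed. */
static inline bool bpf_needs_clear_a(const struct sock_filter *first)
{
	switch (first->code) {
	case BPF_RET | BPF_K:
	case BPF_LD | BPF_W | BPF_LEN:
		return false;
	case BPF_LD | BPF_W | BPF_ABS:
	case BPF_LD | BPF_H | BPF_ABS:
	case BPF_LD | BPF_B | BPF_ABS:
		if (first->k == SKF_AD_OFF + SKF_AD_ALU_XOR_X)
			return true;
		return false;
	default:
		return true;
	}
}
```

Note also that the per-arch call sites in the diff suffered `&` loss in extraction: `bpf_needs_clear_a(>skf->insns[0])` should read `bpf_needs_clear_a(&ctx->skf->insns[0])` (ARM/MIPS) and `bpf_needs_clear_a([0])` should read `bpf_needs_clear_a(&filter[0])` (powerpc).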
Re: [RFC PATCH v2 1/3] PCI: Add support for enforcing all MMIO BARs to be page aligned
On 2016/1/5 4:47, Alex Williamson wrote: On Thu, 2015-12-31 at 16:50 +0800, Yongji Xie wrote: When vfio passthrough a PCI device of which MMIO BARs are smaller than PAGE_SIZE, guest will not handle the mmio accesses to the BARs which leads to mmio emulations in host. This is because vfio will not allow to passthrough one BAR's mmio page which may be shared with other BARs. To solve this performance issue, this patch adds a kernel parameter "pci=resource_page_aligned=on" to enforce the alignments of all MMIO BARs to be at least PAGE_SIZE, so that one BAR's mmio page would not be shared with other BARs. We can also disable it through kernel parameter "pci=resource_page_aligned=off". Shouldn't this somehow be associated with the realloc option? I don't think PCI code will attempt to reprogram anything unless it needs to otherwise. So you mean we need to ignore firmware setup and force re-assigning all resources if we want to use the option "pci=resource_page_aligned=on"? For the default value of this parameter, we think it should be arch-independent, so we add a macro PCI_RESOURCE_PAGE_ALIGNED to change it. And we define this macro to enable this parameter by default on PPC64 platform which can easily hit this performance issue because its PAGE_SIZE is 64KB. Signed-off-by: Yongji Xie--- Documentation/kernel-parameters.txt |4 arch/powerpc/include/asm/pci.h | 11 +++ drivers/pci/pci.c | 17 + drivers/pci/pci.h |7 ++- include/linux/pci.h |2 ++ 5 files changed, 40 insertions(+), 1 deletion(-) diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt index 742f69d..a53aaee 100644 --- a/Documentation/kernel-parameters.txt +++ b/Documentation/kernel-parameters.txt @@ -2857,6 +2857,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted. PAGE_SIZE is used as alignment. PCI-PCI bridge can be specified, if resource windows need to be expanded. 
+ resource_page_aligned= Enable/disable enforcing the alignment + of all PCI devices' memory resources to be + at least PAGE_SIZE. + Format: { "on" | "off" } ecrc= Enable/disable PCIe ECRC (transaction layer end-to-end CRC checking). bios: Use BIOS/firmware settings. This is the diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h index 3453bd8..27bff59 100644 --- a/arch/powerpc/include/asm/pci.h +++ b/arch/powerpc/include/asm/pci.h @@ -136,6 +136,17 @@ extern pgprot_tpci_phys_mem_access_prot(struct file *file, unsigned long pfn, unsigned long size, pgprot_t prot); +#ifdef CONFIG_PPC64 + +/* For PPC64, We enforce all PCI MMIO BARs to be page aligned + * by default. This would be helpful to improve performance + * when we passthrough a PCI device of which BARs are smaller + * than PAGE_SIZE(64KB). And we can use bootcmd + * "pci=resource_page_aligned=off" to disable it. + */ +#define PCI_ENABLE_RESOURCE_PAGE_ALIGNED + +#endif This should be done with something like HAVE_PCI_DEFAULT_RESOURCE_PAGE_ ALIGNED in arch/powerpc/include/asm OK, I will fix it in next version. #define HAVE_ARCH_PCI_RESOURCE_TO_USER extern void pci_resource_to_user(const struct pci_dev *dev, int bar, diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 314db8c..9f14ba5 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -99,6 +99,13 @@ u8 pci_cache_line_size; */ unsigned int pcibios_max_latency = 255; +#ifdef PCI_ENABLE_RESOURCE_PAGE_ALIGNED +bool pci_resource_page_aligned = true; +#else +bool pci_resource_page_aligned; +#endif +EXPORT_SYMBOL(pci_resource_page_aligned); Couldn't this be done in a single line with IS_ENABLED() macro? I'm not sure whether IS_ENABLED() macro should be used there because it is always used for CONFIG_ options. Should this symbol be GPL-only? Yes, it will be fixed in next version. + /* If set, the PCIe ARI capability will not be used. 
*/ static bool pcie_ari_disabled; @@ -4746,6 +4753,14 @@ static ssize_t pci_resource_alignment_store(struct bus_type *bus, BUS_ATTR(resource_alignment, 0644, pci_resource_alignment_show, pci_resource_alignment_store); +static void pci_resource_get_page_aligned(char *str) +{ + if (!strncmp(str, "off", 3)) + pci_resource_page_aligned = false; + else if (!strncmp(str, "on", 2)) + pci_resource_page_aligned = true; +} + static int __init pci_resource_alignment_sysfs_init(void) {
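On the IS_ENABLED() question raised above: the macro keys off CONFIG_ symbols defined to 1, so it would apply here only if a Kconfig symbol were introduced for the feature. A self-contained model of the kernel's IS_ENABLED() machinery (from include/linux/kconfig.h; the CONFIG_PCI_RESOURCE_PAGE_ALIGNED name is hypothetical):

```c
/* Reimplementation of the kernel's IS_ENABLED() preprocessor trick:
 * a symbol defined to 1 expands to "0," via the placeholder macro,
 * shifting a literal 1 into the probed argument position; an undefined
 * symbol leaves junk in the first slot and the probe yields 0. */
#define __ARG_PLACEHOLDER_1		0,
#define __take_second_arg(__ignored, val, ...) val
#define __is_defined(x)			___is_defined(x)
#define ___is_defined(val)		____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk)	__take_second_arg(arg1_or_junk 1, 0)
#define IS_ENABLED(option)		__is_defined(option)

/* Hypothetical Kconfig symbol for the feature discussed in the patch. */
#define CONFIG_PCI_RESOURCE_PAGE_ALIGNED 1

/* The #ifdef PCI_ENABLE_RESOURCE_PAGE_ALIGNED block would then
 * collapse to a single line: */
static const int pci_resource_page_aligned =
	IS_ENABLED(CONFIG_PCI_RESOURCE_PAGE_ALIGNED);
```

This is why Alex's suggestion presumes a CONFIG_ option; with only an arch-provided macro like PCI_ENABLE_RESOURCE_PAGE_ALIGNED (defined empty), the placeholder trick does not fire and the #ifdef stays.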
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Tue, Jan 05, 2016 at 01:08:52PM +0100, Martin Schwidefsky wrote: > On Tue, 5 Jan 2016 11:30:19 +0200 > "Michael S. Tsirkin"wrote: > > > On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote: > > > On Mon, 4 Jan 2016 22:18:58 +0200 > > > "Michael S. Tsirkin" wrote: > > > > > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote: > > > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > > > > > > This defines __smp_xxx barriers for s390, > > > > > > for use by virtualization. > > > > > > > > > > > > Some smp_xxx barriers are removed as they are > > > > > > defined correctly by asm-generic/barriers.h > > > > > > > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers > > > > > > unconditionally on this architecture. > > > > > > > > > > > > Signed-off-by: Michael S. Tsirkin > > > > > > Acked-by: Arnd Bergmann > > > > > > --- > > > > > > arch/s390/include/asm/barrier.h | 15 +-- > > > > > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > > > > > > > > > diff --git a/arch/s390/include/asm/barrier.h > > > > > > b/arch/s390/include/asm/barrier.h > > > > > > index c358c31..fbd25b2 100644 > > > > > > --- a/arch/s390/include/asm/barrier.h > > > > > > +++ b/arch/s390/include/asm/barrier.h > > > > > > @@ -26,18 +26,21 @@ > > > > > > #define wmb() barrier() > > > > > > #define dma_rmb() mb() > > > > > > #define dma_wmb() mb() > > > > > > -#define smp_mb() mb() > > > > > > -#define smp_rmb() rmb() > > > > > > -#define smp_wmb() wmb() > > > > > > - > > > > > > -#define smp_store_release(p, v) > > > > > > \ > > > > > > +#define __smp_mb() mb() > > > > > > +#define __smp_rmb()rmb() > > > > > > +#define __smp_wmb()wmb() > > > > > > +#define smp_mb() __smp_mb() > > > > > > +#define smp_rmb() __smp_rmb() > > > > > > +#define smp_wmb() __smp_wmb() > > > > > > > > > > Why define the smp_*mb() primitives here? Would not the inclusion of > > > > > asm-generic/barrier.h do this? 
> > > > > > > > No because the generic one is a nop on !SMP, this one isn't. > > > > > > > > Pls note this patch is just reordering code without making > > > > functional changes. > > > > And at the moment, on s390 smp_xxx barriers are always non empty. > > > > > > The s390 kernel is SMP to 99.99%, we just didn't bother with a > > > non-smp variant for the memory-barriers. If the generic header > > > is used we'd get the non-smp version for free. It will save a > > > small amount of text space for CONFIG_SMP=n. > > > > OK, so I'll queue a patch to do this then? > > Yes please. OK, I'll add a patch on top in v3. > > Just to make sure: the question would be, are smp_xxx barriers ever used > > in s390 arch specific code to flush in/out memory accesses for > > synchronization with the hypervisor? > > > > I went over s390 arch code and it seems to me the answer is no > > (except of course for virtio). > > Correct. Guest to host communication either uses instructions which > imply a memory barrier or QDIO which uses atomics. And atomics imply a barrier on s390, right? > > But I also see a lot of weirdness on this architecture. > > Mostly historical, s390 actually is one of the easiest architectures in > regard to memory barriers. > > > I found these calls: > > > > arch/s390/include/asm/bitops.h: smp_mb__before_atomic(); > > arch/s390/include/asm/bitops.h: smp_mb(); > > > > Not used in arch specific code so this is likely OK. > > This has been introduced with git commit 5402ea6af11dc5a9, the smp_mb > and smp_mb__before_atomic are used in clear_bit_unlock and > __clear_bit_unlock which are 1:1 copies from the code in > include/asm-generic/bitops/lock.h. Only test_and_set_bit_lock differs > from the generic implementation. something to keep in mind, but I'd rather not touch bitops at the moment - this patchset is already too big. 
> > arch/s390/kernel/vdso.c:smp_mb(); > > > > Looking at > > Author: Christian Borntraeger > > Date: Fri Sep 11 16:23:06 2015 +0200 > > > > s390/vdso: use correct memory barrier > > > > By definition smp_wmb only orders writes against writes. (Finish all > > previous writes, and do not start any future write). To protect the > > vdso init code against early reads on other CPUs, let's use a full > > smp_mb at the end of vdso init. As right now smp_wmb is implemented > > as full serialization, this needs no stable backport, but this > > change > > will be necessary if we reimplement smp_wmb. > > > > ok from hypervisor point of view, but it's also strange: > > 1. why isn't this
[PATCH 1/2] powerpc: sparse: static-ify some things
As sparse suggests, these should be made static. Signed-off-by: Daniel Axtens--- These are random fixes in arch/powerpc/kernel: there's no real pattern to them. It doesn't fix everything. --- arch/powerpc/kernel/eeh_event.c | 2 +- arch/powerpc/kernel/ibmebus.c | 2 +- arch/powerpc/kernel/mce.c | 2 +- arch/powerpc/kernel/rtasd.c | 2 +- arch/powerpc/kernel/vio.c | 4 ++-- 5 files changed, 6 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c index 4eefb6e34dbb..82e7327e3cd0 100644 --- a/arch/powerpc/kernel/eeh_event.c +++ b/arch/powerpc/kernel/eeh_event.c @@ -36,7 +36,7 @@ static DEFINE_SPINLOCK(eeh_eventlist_lock); static struct semaphore eeh_eventlist_sem; -LIST_HEAD(eeh_eventlist); +static LIST_HEAD(eeh_eventlist); /** * eeh_event_handler - Dispatch EEH events. diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c index ac86c53e2542..a89f4f7a66bd 100644 --- a/arch/powerpc/kernel/ibmebus.c +++ b/arch/powerpc/kernel/ibmebus.c @@ -408,7 +408,7 @@ static ssize_t modalias_show(struct device *dev, return len+1; } -struct device_attribute ibmebus_bus_device_attrs[] = { +static struct device_attribute ibmebus_bus_device_attrs[] = { __ATTR_RO(devspec), __ATTR_RO(name), __ATTR_RO(modalias), diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c index b2eb4686bd8f..35138225af6e 100644 --- a/arch/powerpc/kernel/mce.c +++ b/arch/powerpc/kernel/mce.c @@ -37,7 +37,7 @@ static DEFINE_PER_CPU(int, mce_queue_count); static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue); static void machine_check_process_queued_event(struct irq_work *work); -struct irq_work mce_event_process_work = { +static struct irq_work mce_event_process_work = { .func = machine_check_process_queued_event, }; diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c index 5a2c049c1c61..c07f4b665b6b 100644 --- a/arch/powerpc/kernel/rtasd.c +++ b/arch/powerpc/kernel/rtasd.c @@ -442,7 +442,7 
@@ static void do_event_scan(void) } static void rtas_event_scan(struct work_struct *w); -DECLARE_DELAYED_WORK(event_scan_work, rtas_event_scan); +static DECLARE_DELAYED_WORK(event_scan_work, rtas_event_scan); /* * Delay should be at least one second since some machines have problems if diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c index 5f8dcdaa2820..8d7358f3a273 100644 --- a/arch/powerpc/kernel/vio.c +++ b/arch/powerpc/kernel/vio.c @@ -87,7 +87,7 @@ struct vio_cmo_dev_entry { * @curr: bytes currently allocated * @high: high water mark for IO data usage */ -struct vio_cmo { +static struct vio_cmo { spinlock_t lock; struct delayed_work balance_q; struct list_head device_list; @@ -615,7 +615,7 @@ static u64 vio_dma_get_required_mask(struct device *dev) return dma_iommu_ops.get_required_mask(dev); } -struct dma_map_ops vio_dma_mapping_ops = { +static struct dma_map_ops vio_dma_mapping_ops = { .alloc = vio_dma_iommu_alloc_coherent, .free = vio_dma_iommu_free_coherent, .mmap = dma_direct_mmap_coherent, -- 2.6.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH 2/2] powerpc: sparse: Include headers for __weak symbols
Sometimes when sparse warns about undefined symbols, it isn't because they should have 'static' added, it's because they're overriding __weak symbols defined elsewhere, and the header has been missed. Fix a few of them by adding appropriate headers. Signed-off-by: Daniel Axtens--- arch/powerpc/kernel/process.c | 2 ++ arch/powerpc/kernel/prom.c| 1 + arch/powerpc/kernel/time.c| 1 + arch/powerpc/mm/mmap.c| 1 + 4 files changed, 5 insertions(+) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index dccc87e8fee5..afd81390add0 100644 --- a/arch/powerpc/kernel/process.c +++ b/arch/powerpc/kernel/process.c @@ -38,6 +38,7 @@ #include #include #include +#include #include #include @@ -55,6 +56,7 @@ #include #endif #include +#include #include #include diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 7030b035905d..1b082c729c29 100644 --- a/arch/powerpc/kernel/prom.c +++ b/arch/powerpc/kernel/prom.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c index 81b0900a39ee..3ed9a5a21d77 100644 --- a/arch/powerpc/kernel/time.c +++ b/arch/powerpc/kernel/time.c @@ -55,6 +55,7 @@ #include #include #include +#include #include #include diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index 0f0502e12f6c..1ccb2f6ecaf9 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -26,6 +26,7 @@ #include #include #include +#include /* * Top of mmap area (just below the process stack). -- 2.6.4 ___ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X
On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote: > On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote: > > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data > > instructions since it XORs A with X while all the others replace A with > > some loaded value. All the BPF JITs fail to clear A if this is used as > > the first instruction in a filter. > > Is x86_64 part of this 'All' subset ? ;) No, because it's an eBPF JIT.
Re: [PATCH v2 12/32] x86/um: reuse asm-generic/barrier.h
On 31.12.2015 at 20:07, Michael S. Tsirkin wrote: > On x86/um CONFIG_SMP is never defined. As a result, several macros > match the asm-generic variant exactly. Drop the local definitions and > pull in asm-generic/barrier.h instead. > > This is in preparation to refactoring this code area. > > Signed-off-by: Michael S. Tsirkin > Acked-by: Arnd Bergmann Acked-by: Richard Weinberger Thanks, //richard
Re: [PATCH v2 22/32] s390: define __smp_xxx
On Tue, 5 Jan 2016 11:30:19 +0200 "Michael S. Tsirkin"wrote: > On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote: > > On Mon, 4 Jan 2016 22:18:58 +0200 > > "Michael S. Tsirkin" wrote: > > > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote: > > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote: > > > > > This defines __smp_xxx barriers for s390, > > > > > for use by virtualization. > > > > > > > > > > Some smp_xxx barriers are removed as they are > > > > > defined correctly by asm-generic/barriers.h > > > > > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers > > > > > unconditionally on this architecture. > > > > > > > > > > Signed-off-by: Michael S. Tsirkin > > > > > Acked-by: Arnd Bergmann > > > > > --- > > > > > arch/s390/include/asm/barrier.h | 15 +-- > > > > > 1 file changed, 9 insertions(+), 6 deletions(-) > > > > > > > > > > diff --git a/arch/s390/include/asm/barrier.h > > > > > b/arch/s390/include/asm/barrier.h > > > > > index c358c31..fbd25b2 100644 > > > > > --- a/arch/s390/include/asm/barrier.h > > > > > +++ b/arch/s390/include/asm/barrier.h > > > > > @@ -26,18 +26,21 @@ > > > > > #define wmb()barrier() > > > > > #define dma_rmb()mb() > > > > > #define dma_wmb()mb() > > > > > -#define smp_mb() mb() > > > > > -#define smp_rmb()rmb() > > > > > -#define smp_wmb()wmb() > > > > > - > > > > > -#define smp_store_release(p, v) > > > > > \ > > > > > +#define __smp_mb() mb() > > > > > +#define __smp_rmb() rmb() > > > > > +#define __smp_wmb() wmb() > > > > > +#define smp_mb() __smp_mb() > > > > > +#define smp_rmb()__smp_rmb() > > > > > +#define smp_wmb()__smp_wmb() > > > > > > > > Why define the smp_*mb() primitives here? Would not the inclusion of > > > > asm-generic/barrier.h do this? > > > > > > No because the generic one is a nop on !SMP, this one isn't. > > > > > > Pls note this patch is just reordering code without making > > > functional changes. 
> > > And at the moment, on s390 smp_xxx barriers are always non empty. > > > > The s390 kernel is SMP to 99.99%, we just didn't bother with a > > non-smp variant for the memory-barriers. If the generic header > > is used we'd get the non-smp version for free. It will save a > > small amount of text space for CONFIG_SMP=n. > > OK, so I'll queue a patch to do this then? Yes please. > Just to make sure: the question would be, are smp_xxx barriers ever used > in s390 arch specific code to flush in/out memory accesses for > synchronization with the hypervisor? > > I went over s390 arch code and it seems to me the answer is no > (except of course for virtio). Correct. Guest to host communication either uses instructions which imply a memory barrier or QDIO which uses atomics. > But I also see a lot of weirdness on this architecture. Mostly historical, s390 actually is one of the easiest architectures in regard to memory barriers. > I found these calls: > > arch/s390/include/asm/bitops.h: smp_mb__before_atomic(); > arch/s390/include/asm/bitops.h: smp_mb(); > > Not used in arch specific code so this is likely OK. This has been introduced with git commit 5402ea6af11dc5a9, the smp_mb and smp_mb__before_atomic are used in clear_bit_unlock and __clear_bit_unlock which are 1:1 copies from the code in include/asm-generic/bitops/lock.h. Only test_and_set_bit_lock differs from the generic implementation. > arch/s390/kernel/vdso.c:smp_mb(); > > Looking at > Author: Christian Borntraeger > Date: Fri Sep 11 16:23:06 2015 +0200 > > s390/vdso: use correct memory barrier > > By definition smp_wmb only orders writes against writes. (Finish all > previous writes, and do not start any future write). To protect the > vdso init code against early reads on other CPUs, let's use a full > smp_mb at the end of vdso init. As right now smp_wmb is implemented > as full serialization, this needs no stable backport, but this > change > will be necessary if we reimplement smp_wmb. 
> > ok from hypervisor point of view, but it's also strange: > 1. why isn't this paired with another mb somewhere? >this seems to violate barrier pairing rules. > 2. how does smp_mb protect against early reads on other CPUs? >It normally does not: it orders reads from this CPU versus writes >from same CPU. But init code does not appear to read anything. >Maybe this is some s390 specific trick? > > I could not figure out the above commit. That smp_mb can be removed. The initial s390 vdso code is heavily
Re: [PATCH v2 15/32] powerpc: define __smp_xxx
On Tue, Jan 05, 2016 at 06:16:48PM +0200, Michael S. Tsirkin wrote: [snip] > > > > Another thing is that smp_lwsync() may have a third user(other than > > > > smp_load_acquire() and smp_store_release()): > > > > > > > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877 > > > > > > > > I'm OK to change my patch accordingly, but do we really want > > > > smp_lwsync() get involved in this cleanup? If I understand you > > > > correctly, this cleanup focuses on external API like smp_{r,w,}mb(), > > > > while smp_lwsync() is internal to PPC. > > > > > > > > Regards, > > > > Boqun > > > > > > I think you missed the leading ___ :) > > > > > > > What I mean here was smp_lwsync() was originally internal to PPC, but > > never mind ;-) > > > > > smp_store_release is external and it needs __smp_lwsync as > > > defined here. > > > > > > I can duplicate some code and have smp_lwsync *not* call __smp_lwsync > > > > You mean bringing smp_lwsync() back? because I haven't seen you defining > > in asm-generic/barriers.h in previous patches and you just delete it in > > this patch. > > > > > but why do this? Still, if you prefer it this way, > > > please let me know. > > > > > > > I think deleting smp_lwsync() is fine, though I need to change atomic > > variants patches on PPC because of it ;-/ > > > > Regards, > > Boqun > > Sorry, I don't understand - why do you have to do anything? > I changed all users of smp_lwsync so they > use __smp_lwsync on SMP and barrier() on !SMP. > > This is exactly the current behaviour, I also tested that > generated code does not change at all. > > Is there a patch in your tree that conflicts with this? > Because in a patchset which implements atomic relaxed/acquire/release variants on PPC I use smp_lwsync(), this makes it have another user, please see this mail: http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877 in definition of PPC's __atomic_op_release(). 
But I think removing smp_lwsync() is a good idea, and actually I think we
can go further and remove __smp_lwsync() too, letting __smp_load_acquire()
and __smp_store_release() call __lwsync() directly - but that is another
thing.

Anyway, I will modify my patch.

Regards,
Boqun

> > > > > 	WRITE_ONCE(*p, v);						\
> > > > > } while (0)
> > > > >
> > > > > -#define smp_load_acquire(p)						\
> > > > > +#define __smp_load_acquire(p)					\
> > > > > ({									\
> > > > > 	typeof(*p) ___p1 = READ_ONCE(*p);				\
> > > > > 	compiletime_assert_atomic_type(*p);				\
> > > > > -	smp_lwsync();							\
> > > > > +	__smp_lwsync();							\
> > > > > 	___p1;								\
> > > > > })
> > > > >
> > > > > --
> > > > > MST
> > > > >
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > > > > the body of a message to majord...@vger.kernel.org
> > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > Please read the FAQ at http://www.tux.org/lkml/

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH v6 2/4] powerpc: atomic: Implement atomic{,64}_*_return_* variants
Hi all,

I will resend this one to avoid a potential conflict with:

http://article.gmane.org/gmane.linux.kernel/2116880

by open coding smp_lwsync() with:

	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");

Regards,
Boqun
[PATCH RESEND v6 2/4] powerpc: atomic: Implement atomic{,64}_*_return_* variants
On powerpc, acquire and release semantics can be achieved with lightweight
barriers ("lwsync" and "ctrl+isync"), which can be used to implement
__atomic_op_{acquire,release}.

For release semantics, since we only need to ensure that all memory
accesses issued before it take effect before the -store- part of the
atomic, "lwsync" is all we need. On platforms without "lwsync", "sync"
should be used. Therefore, __atomic_op_release() uses PPC_RELEASE_BARRIER.

For acquire semantics, "lwsync" is all we need for the same reason.
However, on platforms without "lwsync", we can use "isync" rather than
"sync" as an acquire barrier. Therefore, __atomic_op_acquire() uses
PPC_ACQUIRE_BARRIER, which is barrier() on UP, "lwsync" if available and
"isync" otherwise.

Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build the
other variants with these helpers.

Signed-off-by: Boqun Feng
---
 arch/powerpc/include/asm/atomic.h | 147 ++
 1 file changed, 85 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 55f106e..a35c277 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -12,6 +12,24 @@
 
 #define ATOMIC_INIT(i)		{ (i) }
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, an isync is enough as an acquire
+ * barrier on platforms without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)				\
+({									\
+	typeof(op##_relaxed(args)) __ret = op##_relaxed(args);		\
+	__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");	\
+	__ret;								\
+})
+
+#define __atomic_op_release(op, args...)				\
+({									\
+	__asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");	\
+	op##_relaxed(args);						\
+})
+
 static __inline__ int atomic_read(const atomic_t *v)
 {
 	int t;
@@ -42,27 +60,27 @@ static __inline__ void atomic_##op(int a, atomic_t *v)	\
 	: "cc");							\
 }									\
 
-#define ATOMIC_OP_RETURN(op, asm_op)					\
-static __inline__ int atomic_##op##_return(int a, atomic_t *v)		\
+#define ATOMIC_OP_RETURN_RELAXED(op, asm_op)				\
+static inline int atomic_##op##_return_relaxed(int a, atomic_t *v)	\
 {									\
 	int t;								\
 									\
 	__asm__ __volatile__(						\
-	PPC_ATOMIC_ENTRY_BARRIER					\
-"1:	lwarx	%0,0,%2		# atomic_" #op "_return\n"		\
-	#asm_op " %0,%1,%0\n"						\
-	PPC405_ERR77(0,%2)						\
-"	stwcx.	%0,0,%2 \n"						\
+"1:	lwarx	%0,0,%3		# atomic_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"						\
+	PPC405_ERR77(0, %3)						\
+"	stwcx.	%0,0,%3\n"						\
 "	bne-	1b\n"							\
-	PPC_ATOMIC_EXIT_BARRIER						\
-	: "=&r" (t)							\
+	: "=&r" (t), "+m" (v->counter)					\
 	: "r" (a), "r" (&v->counter)					\
-	: "cc", "memory");						\
+	: "cc");							\
 									\
 	return t;							\
 }
 
-#define ATOMIC_OPS(op, asm_op) ATOMIC_OP(op, asm_op) ATOMIC_OP_RETURN(op, asm_op)
+#define ATOMIC_OPS(op, asm_op)						\
+	ATOMIC_OP(op, asm_op)						\
+	ATOMIC_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
@@ -71,8 +89,11 @@ ATOMIC_OP(and, and)
 ATOMIC_OP(or, or)
 ATOMIC_OP(xor, xor)
 
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+
 #undef ATOMIC_OPS
Re: [PATCH 1/2] powerpc: sparse: static-ify some things
On 06/01/16 11:45, Daniel Axtens wrote:
> As sparse suggests, these should be made static.
>
> Signed-off-by: Daniel Axtens

Reviewed-by: Andrew Donnellan

-- 
Andrew Donnellan              Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)        IBM Australia Limited