Re: [PATCH v2 06/32] s390: reuse asm-generic/barrier.h

2016-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2016 22:42:44 +0200
"Michael S. Tsirkin"  wrote:

> On Mon, Jan 04, 2016 at 04:03:39PM +0100, Martin Schwidefsky wrote:
> > On Mon, 4 Jan 2016 14:20:42 +0100
> > Peter Zijlstra  wrote:
> > 
> > > On Thu, Dec 31, 2015 at 09:06:30PM +0200, Michael S. Tsirkin wrote:
> > > > On s390 read_barrier_depends, smp_read_barrier_depends
> > > > smp_store_mb(), smp_mb__before_atomic and smp_mb__after_atomic match the
> > > > asm-generic variants exactly. Drop the local definitions and pull in
> > > > asm-generic/barrier.h instead.
> > > > 
> > > > This is in preparation to refactoring this code area.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > Acked-by: Arnd Bergmann 
> > > > ---
> > > >  arch/s390/include/asm/barrier.h | 10 ++
> > > >  1 file changed, 2 insertions(+), 8 deletions(-)
> > > > 
> > > > diff --git a/arch/s390/include/asm/barrier.h 
> > > > b/arch/s390/include/asm/barrier.h
> > > > index 7ffd0b1..c358c31 100644
> > > > --- a/arch/s390/include/asm/barrier.h
> > > > +++ b/arch/s390/include/asm/barrier.h
> > > > @@ -30,14 +30,6 @@
> > > >  #define smp_rmb()  rmb()
> > > >  #define smp_wmb()  wmb()
> > > >  
> > > > -#define read_barrier_depends() do { } while (0)
> > > > -#define smp_read_barrier_depends() do { } while (0)
> > > > -
> > > > -#define smp_mb__before_atomic()  smp_mb()
> > > > -#define smp_mb__after_atomic() smp_mb()
> > > 
> > > As per:
> > > 
> > >   lkml.kernel.org/r/20150921112252.3c2937e1@mschwide
> > > 
> > > s390 should change this to barrier() instead of smp_mb() and hence
> > > should not use the generic versions.
> >  
> > Yes, we wanted to simplify this. Thanks for the reminder, I'll queue
> > a patch.
> 
> Could you base on my patchset maybe, to avoid conflicts,
> and I'll merge it?
> Or if it's just replacing these 2 with barrier() I can do this
> myself easily.

Probably the easiest solution is if you do the patch yourself and
include it in your patch set.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Martin Schwidefsky
On Mon, 4 Jan 2016 22:18:58 +0200
"Michael S. Tsirkin"  wrote:

> On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > This defines __smp_xxx barriers for s390,
> > > for use by virtualization.
> > > 
> > > Some smp_xxx barriers are removed as they are
> > > defined correctly by asm-generic/barrier.h
> > > 
> > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > unconditionally on this architecture.
> > > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > Acked-by: Arnd Bergmann 
> > > ---
> > >  arch/s390/include/asm/barrier.h | 15 +--
> > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/arch/s390/include/asm/barrier.h 
> > > b/arch/s390/include/asm/barrier.h
> > > index c358c31..fbd25b2 100644
> > > --- a/arch/s390/include/asm/barrier.h
> > > +++ b/arch/s390/include/asm/barrier.h
> > > @@ -26,18 +26,21 @@
> > >  #define wmb()  barrier()
> > >  #define dma_rmb()  mb()
> > >  #define dma_wmb()  mb()
> > > -#define smp_mb()   mb()
> > > -#define smp_rmb()  rmb()
> > > -#define smp_wmb()  wmb()
> > > -
> > > -#define smp_store_release(p, v)  
> > > \
> > > +#define __smp_mb()   mb()
> > > +#define __smp_rmb()  rmb()
> > > +#define __smp_wmb()  wmb()
> > > +#define smp_mb() __smp_mb()
> > > +#define smp_rmb()  __smp_rmb()
> > > +#define smp_wmb()  __smp_wmb()
> > 
> > Why define the smp_*mb() primitives here? Would not the inclusion of
> > asm-generic/barrier.h do this?
> 
> No because the generic one is a nop on !SMP, this one isn't.
> 
> Pls note this patch is just reordering code without making
> functional changes.
> And at the moment, on s390 smp_xxx barriers are always non empty.

The s390 kernel is SMP to 99.99%; we just didn't bother with a
non-smp variant for the memory barriers. If the generic header
is used we'd get the non-smp version for free. It will save a
small amount of text space for CONFIG_SMP=n.
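
For context, the asm-generic/barrier.h pattern being referred to looks roughly
like this (a simplified sketch): each definition is guarded by #ifndef so an
architecture can override it, and the smp_* variants collapse to compiler
barriers when CONFIG_SMP=n:

#ifdef CONFIG_SMP
#ifndef smp_mb
#define smp_mb()	__smp_mb()	/* the arch's real barrier */
#endif
#else /* !CONFIG_SMP */
#ifndef smp_mb
#define smp_mb()	barrier()	/* compiler barrier only */
#endif
#endif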
 
> Some of this could be sub-optimal, but
> since on s390 Linux always runs on a hypervisor,
> I am not sure it's safe to use the generic version -
> in other words, it just might be that for s390 smp_ and virt_
> barriers must be equivalent.

The definition of the memory barriers is independent of whether
the system is running on a hypervisor or not. Is there really
an architecture where you need special virt_xxx barriers?!?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote:
> Hi Michael,
> 
> On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote:
> > This defines __smp_xxx barriers for powerpc
> > for use by virtualization.
> > 
> > smp_xxx barriers are removed as they are
> > defined correctly by asm-generic/barrier.h

I think this is the part that was missed in review.

> > This reduces the amount of arch-specific boiler-plate code.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > Acked-by: Arnd Bergmann 
> > ---
> >  arch/powerpc/include/asm/barrier.h | 24 
> >  1 file changed, 8 insertions(+), 16 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/barrier.h 
> > b/arch/powerpc/include/asm/barrier.h
> > index 980ad0c..c0deafc 100644
> > --- a/arch/powerpc/include/asm/barrier.h
> > +++ b/arch/powerpc/include/asm/barrier.h
> > @@ -44,19 +44,11 @@
> >  #define dma_rmb()  __lwsync()
> >  #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
> > :"memory")
> >  
> > -#ifdef CONFIG_SMP
> > -#define smp_lwsync()   __lwsync()
> > +#define __smp_lwsync() __lwsync()
> >  
> 
> so __smp_lwsync() is always mapped to lwsync, right?

Yes.

> > -#define smp_mb()   mb()
> > -#define smp_rmb()  __lwsync()
> > -#define smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
> > :"memory")
> > -#else
> > -#define smp_lwsync()   barrier()
> > -
> > -#define smp_mb()   barrier()
> > -#define smp_rmb()  barrier()
> > -#define smp_wmb()  barrier()
> > -#endif /* CONFIG_SMP */
> > +#define __smp_mb() mb()
> > +#define __smp_rmb()__lwsync()
> > +#define __smp_wmb()__asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > : :"memory")
> >  
> >  /*
> >   * This is a barrier which prevents following instructions from being
> > @@ -67,18 +59,18 @@
> >  #define data_barrier(x)\
> > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> >  
> > -#define smp_store_release(p, v)
> > \
> > +#define __smp_store_release(p, v)  
> > \
> >  do {   
> > \
> > compiletime_assert_atomic_type(*p); \
> > -   smp_lwsync();   \
> > +   __smp_lwsync(); \
> 
> , therefore this will emit an lwsync no matter whether SMP or UP.

Absolutely. But smp_store_release (without __) will not.

Please note I did test this: for ppc, the code before and after
this patch generates exactly the same binary on SMP and UP.
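
A sketch of why the binaries match (reconstructed from the hunks quoted above,
not text from the thread): smp_store_release() now comes from
asm-generic/barrier.h, which picks the arch __ variant on SMP and a compiler
barrier on UP, matching the old #ifdef CONFIG_SMP definitions of smp_lwsync():

/* CONFIG_SMP=y: smp_store_release(p, v) expands via __smp_store_release() to */
do {
	compiletime_assert_atomic_type(*p);
	__lwsync();		/* same as the old SMP smp_lwsync() */
	WRITE_ONCE(*p, v);
} while (0);

/* CONFIG_SMP=n: the asm-generic fallback expands to */
do {
	compiletime_assert_atomic_type(*p);
	barrier();		/* same as the old UP smp_lwsync() */
	WRITE_ONCE(*p, v);
} while (0);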


> Another thing is that smp_lwsync() may have a third user (other than
> smp_load_acquire() and smp_store_release()):
> 
> http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> 
> I'm OK to change my patch accordingly, but do we really want
> smp_lwsync() to get involved in this cleanup? If I understand you
> correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> while smp_lwsync() is internal to PPC.
> 
> Regards,
> Boqun

I think you missed the leading __ :)

smp_store_release is external and it needs __smp_lwsync as
defined here.

I can duplicate some code and have smp_lwsync *not* call __smp_lwsync
but why do this? Still, if you prefer it this way,
please let me know.

> > WRITE_ONCE(*p, v);  \
> >  } while (0)
> >  
> > -#define smp_load_acquire(p)
> > \
> > +#define __smp_load_acquire(p)  
> > \
> >  ({ \
> > typeof(*p) ___p1 = READ_ONCE(*p);   \
> > compiletime_assert_atomic_type(*p); \
> > -   smp_lwsync();   \
> > +   __smp_lwsync(); \
> > ___p1;  \
> >  })
> >  
> > -- 
> > MST
> > 

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote:
> On Mon, 4 Jan 2016 22:18:58 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > > This defines __smp_xxx barriers for s390,
> > > > for use by virtualization.
> > > > 
> > > > Some smp_xxx barriers are removed as they are
> > > > defined correctly by asm-generic/barrier.h
> > > > 
> > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > > unconditionally on this architecture.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > Acked-by: Arnd Bergmann 
> > > > ---
> > > >  arch/s390/include/asm/barrier.h | 15 +--
> > > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > > 
> > > > diff --git a/arch/s390/include/asm/barrier.h 
> > > > b/arch/s390/include/asm/barrier.h
> > > > index c358c31..fbd25b2 100644
> > > > --- a/arch/s390/include/asm/barrier.h
> > > > +++ b/arch/s390/include/asm/barrier.h
> > > > @@ -26,18 +26,21 @@
> > > >  #define wmb()  barrier()
> > > >  #define dma_rmb()  mb()
> > > >  #define dma_wmb()  mb()
> > > > -#define smp_mb()   mb()
> > > > -#define smp_rmb()  rmb()
> > > > -#define smp_wmb()  wmb()
> > > > -
> > > > -#define smp_store_release(p, v)
> > > > \
> > > > +#define __smp_mb() mb()
> > > > +#define __smp_rmb()rmb()
> > > > +#define __smp_wmb()wmb()
> > > > +#define smp_mb()   __smp_mb()
> > > > +#define smp_rmb()  __smp_rmb()
> > > > +#define smp_wmb()  __smp_wmb()
> > > 
> > > Why define the smp_*mb() primitives here? Would not the inclusion of
> > > asm-generic/barrier.h do this?
> > 
> > No because the generic one is a nop on !SMP, this one isn't.
> > 
> > Pls note this patch is just reordering code without making
> > functional changes.
> > And at the moment, on s390 smp_xxx barriers are always non empty.
> 
> The s390 kernel is SMP to 99.99%; we just didn't bother with a
> non-smp variant for the memory barriers. If the generic header
> is used we'd get the non-smp version for free. It will save a
> small amount of text space for CONFIG_SMP=n.

OK, so I'll queue a patch to do this then?

Just to make sure: the question would be, are smp_xxx barriers ever used
in s390 arch specific code to flush in/out memory accesses for
synchronization with the hypervisor?

I went over s390 arch code and it seems to me the answer is no
(except of course for virtio).

But I also see a lot of weirdness on this architecture.

I found these calls:

arch/s390/include/asm/bitops.h: smp_mb__before_atomic();
arch/s390/include/asm/bitops.h: smp_mb();

Not used in arch specific code so this is likely OK.

arch/s390/kernel/vdso.c:smp_mb();

Looking at
Author: Christian Borntraeger 
Date:   Fri Sep 11 16:23:06 2015 +0200

s390/vdso: use correct memory barrier

By definition smp_wmb only orders writes against writes. (Finish all
previous writes, and do not start any future write). To protect the
vdso init code against early reads on other CPUs, let's use a full
smp_mb at the end of vdso init. As right now smp_wmb is implemented
as full serialization, this needs no stable backport, but this 
change
will be necessary if we reimplement smp_wmb.

ok from hypervisor point of view, but it's also strange:
1. why isn't this paired with another mb somewhere?
   this seems to violate barrier pairing rules.
2. how does smp_mb protect against early reads on other CPUs?
   It normally does not: it orders reads from this CPU versus writes
   from same CPU. But init code does not appear to read anything.
   Maybe this is some s390 specific trick?

I could not figure out the above commit.
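
For reference, the usual pairing pattern these questions allude to (a generic
illustration, not code from s390):

/* Writer */
data = 42;
smp_wmb();		/* order the data store before the flag store */
WRITE_ONCE(ready, 1);

/* Reader -- the side the writer's smp_wmb() pairs with */
while (!READ_ONCE(ready))
	cpu_relax();
smp_rmb();		/* order the flag load before the data loads */
do_something(data);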


arch/s390/kvm/kvm-s390.c:   smp_mb();

Does not appear to be paired with anything.


arch/s390/lib/spinlock.c:   smp_mb();
arch/s390/lib/spinlock.c:   smp_mb();

Seems ok, and appears paired properly.
Just to make sure - spinlock is not paravirtualized on s390, is it?

arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();
arch/s390/kernel/time.c:smp_wmb();

It's all around vdso, so I'm guessing userspace is using this,
which is why there's no pairing.



> > Some of this could be sub-optimal, but
> > since on s390 Linux always runs on a hypervisor,
> > I am not sure it's safe to use the generic version -
> > in other words, it just might be that for s390 smp_ and virt_
> > barriers must be equivalent.
> 

Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Boqun Feng
On Tue, Jan 05, 2016 at 10:51:17AM +0200, Michael S. Tsirkin wrote:
> On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote:
> > Hi Michael,
> > 
> > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote:
> > > This defines __smp_xxx barriers for powerpc
> > > for use by virtualization.
> > > 
> > > smp_xxx barriers are removed as they are
> > > defined correctly by asm-generic/barrier.h
> 
> I think this is the part that was missed in review.
> 

Yes, I realized my mistake after rereading the series. But smp_lwsync() is
not defined in asm-generic/barrier.h, right?

> > > This reduces the amount of arch-specific boiler-plate code.
> > > 
> > > Signed-off-by: Michael S. Tsirkin 
> > > Acked-by: Arnd Bergmann 
> > > ---
> > >  arch/powerpc/include/asm/barrier.h | 24 
> > >  1 file changed, 8 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/include/asm/barrier.h 
> > > b/arch/powerpc/include/asm/barrier.h
> > > index 980ad0c..c0deafc 100644
> > > --- a/arch/powerpc/include/asm/barrier.h
> > > +++ b/arch/powerpc/include/asm/barrier.h
> > > @@ -44,19 +44,11 @@
> > >  #define dma_rmb()  __lwsync()
> > >  #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > >  
> > > -#ifdef CONFIG_SMP
> > > -#define smp_lwsync() __lwsync()
> > > +#define __smp_lwsync()   __lwsync()
> > >  
> > 
> > so __smp_lwsync() is always mapped to lwsync, right?
> 
> Yes.
> 
> > > -#define smp_mb() mb()
> > > -#define smp_rmb()  __lwsync()
> > > -#define smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > > -#else
> > > -#define smp_lwsync() barrier()
> > > -
> > > -#define smp_mb() barrier()
> > > -#define smp_rmb()  barrier()
> > > -#define smp_wmb()  barrier()
> > > -#endif /* CONFIG_SMP */
> > > +#define __smp_mb()   mb()
> > > +#define __smp_rmb()  __lwsync()
> > > +#define __smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > : :"memory")
> > >  
> > >  /*
> > >   * This is a barrier which prevents following instructions from being
> > > @@ -67,18 +59,18 @@
> > >  #define data_barrier(x)  \
> > >   asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> > >  
> > > -#define smp_store_release(p, v)  
> > > \
> > > +#define __smp_store_release(p, v)
> > > \
> > >  do { 
> > > \
> > >   compiletime_assert_atomic_type(*p); \
> > > - smp_lwsync();   \
> > > + __smp_lwsync(); \
> > 
> > , therefore this will emit an lwsync no matter whether SMP or UP.
> 
> Absolutely. But smp_store_release (without __) will not.
> 
> Please note I did test this: for ppc, the code before and after
> this patch generates exactly the same binary on SMP and UP.
> 

Yes, you're right, sorry for my mistake...

> 
> > Another thing is that smp_lwsync() may have a third user (other than
> > smp_load_acquire() and smp_store_release()):
> > 
> > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> > 
> > I'm OK to change my patch accordingly, but do we really want
> > smp_lwsync() to get involved in this cleanup? If I understand you
> > correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> > while smp_lwsync() is internal to PPC.
> > 
> > Regards,
> > Boqun
> 
> I think you missed the leading __ :)
> 

What I meant here was that smp_lwsync() was originally internal to PPC, but
never mind ;-)

> smp_store_release is external and it needs __smp_lwsync as
> defined here.
> 
> I can duplicate some code and have smp_lwsync *not* call __smp_lwsync

You mean bringing smp_lwsync() back? Because I haven't seen you define it
in asm-generic/barrier.h in previous patches, and you just delete it in
this patch.

> but why do this? Still, if you prefer it this way,
> please let me know.
> 

I think deleting smp_lwsync() is fine, though I need to change the atomic
variants patches on PPC because of it ;-/

Regards,
Boqun

> > >   WRITE_ONCE(*p, v);  \
> > >  } while (0)
> > >  
> > > -#define smp_load_acquire(p)  
> > > \
> > > +#define __smp_load_acquire(p)
> > > \
> > >  ({   
> > > \
> > >   typeof(*p) ___p1 = READ_ONCE(*p);   \
> > >   compiletime_assert_atomic_type(*p); \
> > > - smp_lwsync();   \
> > > + __smp_lwsync(); \
> > >   ___p1;  \
> > >  })
> > >  
> > > -- 
> > 

Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Eric Dumazet
On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value.  All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter.

Is x86_64 part of this 'All' subset ? ;)

>   This was found using american fuzzy
> lop.
> 
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs.  Except for ARM, the
> rest have only been compile-tested.
> 
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent 
> ---



Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Alexei Starovoitov
On Tue, Jan 05, 2016 at 05:36:47PM +0100, Daniel Borkmann wrote:
> On 01/05/2016 04:23 PM, Rabin Vincent wrote:
> >The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> >instructions since it XORs A with X while all the others replace A with
> >some loaded value.  All the BPF JITs fail to clear A if this is used as
> >the first instruction in a filter.  This was found using american fuzzy
> >lop.
> >
> >Add a helper to determine if A needs to be cleared given the first
> >instruction in a filter, and use this in the JITs.  Except for ARM, the
> >rest have only been compile-tested.
> >
> >Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> >Signed-off-by: Rabin Vincent 
> 
> Excellent catch, thanks a lot! The fix looks good to me and should
> go to -net tree.
> 
> Acked-by: Daniel Borkmann 

good catch indeed.
Classic bpf jits didn't have much love. Great to see this work.

Acked-by: Alexei Starovoitov 


Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 05:53:41PM +0800, Boqun Feng wrote:
> On Tue, Jan 05, 2016 at 10:51:17AM +0200, Michael S. Tsirkin wrote:
> > On Tue, Jan 05, 2016 at 09:36:55AM +0800, Boqun Feng wrote:
> > > Hi Michael,
> > > 
> > > On Thu, Dec 31, 2015 at 09:07:42PM +0200, Michael S. Tsirkin wrote:
> > > > This defines __smp_xxx barriers for powerpc
> > > > for use by virtualization.
> > > > 
> > > > smp_xxx barriers are removed as they are
> > > > defined correctly by asm-generic/barrier.h
> > 
> > I think this is the part that was missed in review.
> > 
> 
> Yes, I realized my mistake after rereading the series. But smp_lwsync() is
> not defined in asm-generic/barrier.h, right?

It isn't, because as far as I could tell it is not used
outside of smp_store_release and smp_load_acquire in
arch/powerpc/include/asm/barrier.h.

And these are now gone.

Instead there are __smp_store_release and __smp_load_acquire
which call __smp_lwsync.
These are only used for virt and on SMP.
UP variants are generic - they just call barrier().
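
For context, a simplified sketch of how the series wires this up for
virtualization (based on the asm-generic patches in the same series):

/* include/asm-generic/barrier.h, simplified: the virt_xxx barriers map
 * to the __smp_xxx variants unconditionally, even when CONFIG_SMP=n,
 * because the host is SMP even if the guest kernel is not. */
#define virt_mb()			__smp_mb()
#define virt_store_release(p, v)	__smp_store_release(p, v)
#define virt_load_acquire(p)		__smp_load_acquire(p)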


> > > > This reduces the amount of arch-specific boiler-plate code.
> > > > 
> > > > Signed-off-by: Michael S. Tsirkin 
> > > > Acked-by: Arnd Bergmann 
> > > > ---
> > > >  arch/powerpc/include/asm/barrier.h | 24 
> > > >  1 file changed, 8 insertions(+), 16 deletions(-)
> > > > 
> > > > diff --git a/arch/powerpc/include/asm/barrier.h 
> > > > b/arch/powerpc/include/asm/barrier.h
> > > > index 980ad0c..c0deafc 100644
> > > > --- a/arch/powerpc/include/asm/barrier.h
> > > > +++ b/arch/powerpc/include/asm/barrier.h
> > > > @@ -44,19 +44,11 @@
> > > >  #define dma_rmb()  __lwsync()
> > > >  #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > > : :"memory")
> > > >  
> > > > -#ifdef CONFIG_SMP
> > > > -#define smp_lwsync()   __lwsync()
> > > > +#define __smp_lwsync() __lwsync()
> > > >  
> > > 
> > > so __smp_lwsync() is always mapped to lwsync, right?
> > 
> > Yes.
> > 
> > > > -#define smp_mb()   mb()
> > > > -#define smp_rmb()  __lwsync()
> > > > -#define smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > > : :"memory")
> > > > -#else
> > > > -#define smp_lwsync()   barrier()
> > > > -
> > > > -#define smp_mb()   barrier()
> > > > -#define smp_rmb()  barrier()
> > > > -#define smp_wmb()  barrier()
> > > > -#endif /* CONFIG_SMP */
> > > > +#define __smp_mb() mb()
> > > > +#define __smp_rmb()  __lwsync()
> > > > +#define __smp_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : 
> > > > : :"memory")
> > > >  
> > > >  /*
> > > >   * This is a barrier which prevents following instructions from being
> > > > @@ -67,18 +59,18 @@
> > > >  #define data_barrier(x)\
> > > > asm volatile("twi 0,%0,0; isync" : : "r" (x) : "memory");
> > > >  
> > > > -#define smp_store_release(p, v)
> > > > \
> > > > +#define __smp_store_release(p, v)  
> > > > \
> > > >  do {   
> > > > \
> > > > compiletime_assert_atomic_type(*p); 
> > > > \
> > > > -   smp_lwsync();   
> > > > \
> > > > +   __smp_lwsync(); 
> > > > \
> > > 
> > > , therefore this will emit an lwsync no matter whether SMP or UP.
> > 
> > Absolutely. But smp_store_release (without __) will not.
> > 
> > Please note I did test this: for ppc, the code before and after
> > this patch generates exactly the same binary on SMP and UP.
> > 
> 
> Yes, you're right, sorry for my mistake...
> 
> > 
> > > Another thing is that smp_lwsync() may have a third user (other than
> > > smp_load_acquire() and smp_store_release()):
> > > 
> > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> > > 
> > > I'm OK to change my patch accordingly, but do we really want
> > > smp_lwsync() to get involved in this cleanup? If I understand you
> > > correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> > > while smp_lwsync() is internal to PPC.
> > > 
> > > Regards,
> > > Boqun
> > 
> > I think you missed the leading __ :)
> > 
> 
> What I meant here was that smp_lwsync() was originally internal to PPC, but
> never mind ;-)
> 
> > smp_store_release is external and it needs __smp_lwsync as
> > defined here.
> > 
> > I can duplicate some code and have smp_lwsync *not* call __smp_lwsync
> 
> You mean bringing smp_lwsync() back? Because I haven't seen you define it
> in asm-generic/barrier.h in previous patches, and you just delete it in
> this patch.
> 
> > but why do this? Still, if you prefer it this way,
> > please let me know.
> > 
> 
> I think deleting smp_lwsync() is fine, though I need to change the atomic
> variants patches on PPC because of it ;-/
> 
> Regards,
> Boqun

Sorry, I don't understand - why do you have to do anything?
I changed 

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 04:39:37PM +0100, Christian Borntraeger wrote:
> On 01/05/2016 10:30 AM, Michael S. Tsirkin wrote:
> 
> > 
> > arch/s390/kernel/vdso.c:smp_mb();
> > 
> > Looking at
> > Author: Christian Borntraeger 
> > Date:   Fri Sep 11 16:23:06 2015 +0200
> > 
> > s390/vdso: use correct memory barrier
> > 
> > By definition smp_wmb only orders writes against writes. (Finish all
> > previous writes, and do not start any future write). To protect the
> > vdso init code against early reads on other CPUs, let's use a full
> > smp_mb at the end of vdso init. As right now smp_wmb is implemented
> > as full serialization, this needs no stable backport, but this 
> > change
> > will be necessary if we reimplement smp_wmb.
> > 
> > ok from hypervisor point of view, but it's also strange:
> > 1. why isn't this paired with another mb somewhere?
> >this seems to violate barrier pairing rules.
> > 2. how does smp_mb protect against early reads on other CPUs?
> >It normally does not: it orders reads from this CPU versus writes
> >from same CPU. But init code does not appear to read anything.
> >Maybe this is some s390 specific trick?
> > 
> > I could not figure out the above commit.
> 
> It was probably me misreading the code. I changed a wmb into a full mb here
> since I was changing the definition of wmb to a compiler barrier. I tried to
> fix up all users of wmb that really pair with other code. I assumed that there
> must be some reader (as there was a wmb before) but I could not figure out
> which. So I just played safe here.
> 
> But it probably can be removed.
> 
> > arch/s390/kvm/kvm-s390.c:   smp_mb();
> 
> This can go. If you have a patch, I can carry that via the kvms390 tree,
> or I will spin a new patch with you as suggested-by.
> 
> Christian

I have both, will post shortly.

-- 
MST

Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 05:03 PM, Rabin Vincent wrote:

On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote:

On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.


Is x86_64 part of this 'All' subset ? ;)


No, because it's an eBPF JIT.


Correct, filter conversion to eBPF clears it already.

Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Daniel Borkmann

On 01/05/2016 04:23 PM, Rabin Vincent wrote:

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.  This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs.  Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent 


Excellent catch, thanks a lot! The fix looks good to me and should
go to -net tree.

Acked-by: Daniel Borkmann 

If you're interested, feel free to add a small test case for the
SKF_AD_ALU_XOR_X issue to lib/test_bpf.c for -net-next tree. Thanks!
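
A hypothetical lib/test_bpf.c entry along those lines (illustrative only; it
follows the layout of the existing classic-BPF tests):

{
	"LD_ALU_XOR_X as first instruction",
	.u.insns = {
		/* A ^= X; both start as zero, so the result must be 0.
		 * With the JIT bug, A was whatever the register held. */
		BPF_STMT(BPF_LD | BPF_W | BPF_ABS,
			 SKF_AD_OFF + SKF_AD_ALU_XOR_X),
		BPF_STMT(BPF_RET | BPF_A, 0),
	},
	CLASSIC,
	{ },
	{ { 1, 0 } },	/* one byte of input, expect return value 0 */
},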

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Christian Borntraeger
On 01/05/2016 10:30 AM, Michael S. Tsirkin wrote:

> 
> arch/s390/kernel/vdso.c:smp_mb();
> 
> Looking at
>   Author: Christian Borntraeger 
>   Date:   Fri Sep 11 16:23:06 2015 +0200
> 
>   s390/vdso: use correct memory barrier
> 
>   By definition smp_wmb only orders writes against writes. (Finish all
>   previous writes, and do not start any future write). To protect the
>   vdso init code against early reads on other CPUs, let's use a full
>   smp_mb at the end of vdso init. As right now smp_wmb is implemented
>   as full serialization, this needs no stable backport, but this 
> change
>   will be necessary if we reimplement smp_wmb.
> 
> ok from hypervisor point of view, but it's also strange:
> 1. why isn't this paired with another mb somewhere?
>this seems to violate barrier pairing rules.
> 2. how does smp_mb protect against early reads on other CPUs?
>It normally does not: it orders reads from this CPU versus writes
>from same CPU. But init code does not appear to read anything.
>Maybe this is some s390 specific trick?
> 
> I could not figure out the above commit.

It was probably me misreading the code. I changed a wmb into a full mb here
since I was changing the definition of wmb to a compiler barrier. I tried to
fix up all users of wmb that really pair with other code. I assumed that there
must be some reader (as there was a wmb before) but I could not figure out
which. So I just played safe here.

But it probably can be removed.

> arch/s390/kvm/kvm-s390.c:   smp_mb();

This can go. If you have a patch, I can carry that via the kvms390 tree,
or I will spin a new patch with you as suggested-by.

Christian


Re: [alsa-devel] [PATCH 1/3] ASoC: fsl_ssi: mark SACNT register volatile

2016-01-05 Thread Fabio Estevam
Timur,

On Thu, Dec 24, 2015 at 2:12 PM, Timur Tabi  wrote:
> On Sun, Dec 20, 2015 at 2:30 PM, Maciej S. Szmigiero
>  wrote:
>> SACNT register should be marked volatile since
>> its WR and RD bits are cleared by SSI after
>> completing the relevant operation.
>> This unbreaks AC'97 register access.
>>
>> Fixes: 05cf237972fe ("ASoC: fsl_ssi: Add driver suspend and resume to 
>> support MEGA Fast")
>>
>> Signed-off-by: Maciej S. Szmigiero 
>
> These patches seem okay, but can we hold off merging them until I get
> back from vacation and have a chance to review them?

Have you had a chance to review these patches?
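
For context, marking a register volatile in a regmap-based driver like fsl_ssi
means the register cache will neither cache reads nor skip writes for it. A
minimal sketch of the mechanism (register names assumed, not the actual patch):

static bool fsl_ssi_volatile_reg(struct device *dev, unsigned int reg)
{
	switch (reg) {
	case CCSR_SSI_SACNT:	/* WR/RD bits are cleared by the SSI itself */
		return true;
	default:
		return false;
	}
}

static const struct regmap_config fsl_ssi_regconfig = {
	/* ... */
	.volatile_reg = fsl_ssi_volatile_reg,
};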

Re: [PATCH v4 2/9] ppc64le FTRACE_WITH_REGS implementation

2016-01-05 Thread Torsten Duwe
On Wed, Dec 02, 2015 at 09:18:05AM +1100, Michael Ellerman wrote:
> 
> I (still) haven't had a chance to have a good look at it, but I won't this 
> week
> anyway. So post v5 and hopefully I can review that and it will be perfect :)

The perfect v5 has been there for 4 weeks now, minus the holiday season,
and no further problems arose. It was independently verified by Petr
Mladek -- don't forget his high-level fix.

Torsten


Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2016 15:04:43 +0200
"Michael S. Tsirkin"  wrote:

> On Tue, Jan 05, 2016 at 01:08:52PM +0100, Martin Schwidefsky wrote:
> > On Tue, 5 Jan 2016 11:30:19 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote:
> > > > On Mon, 4 Jan 2016 22:18:58 +0200
> > > > "Michael S. Tsirkin"  wrote:
> > > > 
> > > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > > > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > > > > > This defines __smp_xxx barriers for s390,
> > > > > > > for use by virtualization.
> > > > > > > 
> > > > > > > Some smp_xxx barriers are removed as they are
> > > > > > > defined correctly by asm-generic/barrier.h
> > > > > > > 
> > > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > > > > > unconditionally on this architecture.
> > > > > > > 
> > > > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > > > Acked-by: Arnd Bergmann 
> > > > > > > ---
> > > > > > >  arch/s390/include/asm/barrier.h | 15 +--
> > > > > > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/arch/s390/include/asm/barrier.h 
> > > > > > > b/arch/s390/include/asm/barrier.h
> > > > > > > index c358c31..fbd25b2 100644
> > > > > > > --- a/arch/s390/include/asm/barrier.h
> > > > > > > +++ b/arch/s390/include/asm/barrier.h
> > > > > > > @@ -26,18 +26,21 @@
> > > > > > >  #define wmb()  barrier()
> > > > > > >  #define dma_rmb()  mb()
> > > > > > >  #define dma_wmb()  mb()
> > > > > > > -#define smp_mb()   mb()
> > > > > > > -#define smp_rmb()  rmb()
> > > > > > > -#define smp_wmb()  wmb()
> > > > > > > -
> > > > > > > -#define smp_store_release(p, v)  
> > > > > > > \
> > > > > > > +#define __smp_mb()   mb()
> > > > > > > +#define __smp_rmb()  rmb()
> > > > > > > +#define __smp_wmb()  wmb()
> > > > > > > +#define smp_mb() __smp_mb()
> > > > > > > +#define smp_rmb()  __smp_rmb()
> > > > > > > +#define smp_wmb()  __smp_wmb()
> > > > > > 
> > > > > > Why define the smp_*mb() primitives here? Would not the inclusion of
> > > > > > asm-generic/barrier.h do this?
> > > > > 
> > > > > No because the generic one is a nop on !SMP, this one isn't.
> > > > > 
> > > > > Pls note this patch is just reordering code without making
> > > > > functional changes.
> > > > > And at the moment, on s390 smp_xxx barriers are always non empty.
> > > > 
> > > > The s390 kernel is SMP to 99.99%; we just didn't bother with a
> > > > non-smp variant for the memory barriers. If the generic header
> > > > is used we'd get the non-smp version for free. It will save a
> > > > small amount of text space for CONFIG_SMP=n.
> > > 
> > > OK, so I'll queue a patch to do this then?
> > 
> > Yes please.
> 
> OK, I'll add a patch on top in v3.

Good, with this addition:

Acked-by: Martin Schwidefsky 

> > > Just to make sure: the question would be, are smp_xxx barriers ever used
> > > in s390 arch specific code to flush in/out memory accesses for
> > > synchronization with the hypervisor?
> > > 
> > > I went over s390 arch code and it seems to me the answer is no
> > > (except of course for virtio).
> > 
> > Correct. Guest to host communication either uses instructions which
> > imply a memory barrier or QDIO which uses atomics.
> 
> And atomics imply a barrier on s390, right?

Yes they do.

> > > But I also see a lot of weirdness on this architecture.
> > 
> > Mostly historical, s390 actually is one of the easiest architectures in
> > regard to memory barriers.
> > 
> > > I found these calls:
> > > 
> > > arch/s390/include/asm/bitops.h: smp_mb__before_atomic();
> > > arch/s390/include/asm/bitops.h: smp_mb();
> > > 
> > > Not used in arch specific code so this is likely OK.
> > 
> > This was introduced with git commit 5402ea6af11dc5a9; the smp_mb
> > and smp_mb__before_atomic are used in clear_bit_unlock and
> > __clear_bit_unlock, which are 1:1 copies of the code in
> > include/asm-generic/bitops/lock.h. Only test_and_set_bit_lock differs
> > from the generic implementation.
> 
> something to keep in mind, but
> I'd rather not touch bitops at the moment - this patchset is already too big.

With the conversion of smp_mb__before_atomic to a barrier() it does the
correct thing. I don't think that any change is necessary.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


Re: [PATCH v2 17/32] arm: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Mon, Jan 04, 2016 at 01:59:34PM +, Russell King - ARM Linux wrote:
> On Mon, Jan 04, 2016 at 02:54:20PM +0100, Peter Zijlstra wrote:
> > On Mon, Jan 04, 2016 at 02:36:58PM +0100, Peter Zijlstra wrote:
> > > On Sun, Jan 03, 2016 at 11:12:44AM +0200, Michael S. Tsirkin wrote:
> > > > On Sat, Jan 02, 2016 at 11:24:38AM +, Russell King - ARM Linux 
> > > > wrote:
> > > 
> > > > > My only concern is that it gives people an additional handle onto a
> > > > > "new" set of barriers - just because they're prefixed with __*
> > > > > unfortunately doesn't stop anyone from using it (been there with
> > > > > other arch stuff before.)
> > > > > 
> > > > > I wonder whether we should consider making the smp memory barriers
> > > > > inline functions, so these __smp_xxx() variants can be undef'd
> > > > > afterwards, thereby preventing drivers getting their hands on these
> > > > > new macros?
> > > > 
> > > > That'd be tricky to do cleanly since asm-generic depends on
> > > > ifndef to add generic variants where needed.
> > > > 
> > > > But it would be possible to add a checkpatch test for this.
> > > 
> > > Wasn't the whole purpose of these things for 'drivers' (namely
> > > virtio/xen hypervisor interaction) to use these?
> > 
> > Ah, I see, you add virt_*mb() stuff later on for that use case.
> > 
> > So, assuming everybody does include asm-generic/barrier.h, you could
> > simply #undef the __smp version at the end of that, once we've generated
> > all the regular primitives from it, no?
> 
> Not so simple - that's why I mentioned using inline functions.
> 
> The new smp_* _macros_ are:
> 
> +#define smp_mb()   __smp_mb()
> 
> which means if we simply #undef __smp_mb(), smp_mb() then points at
> something which is no longer available, and we'll end up with errors
> saying that __smp_mb() doesn't exist.
> 
> My suggestion was to change:
> 
> #ifndef smp_mb
> #define smp_mb()  __smp_mb()
> #endif
> 
> to:
> 
> #ifndef smp_mb
> static inline void smp_mb(void)
> {
>   __smp_mb();
> }
> #endif
> 
> which then means __smp_mb() and friends can be #undef'd afterwards.

Absolutely, I got it.
The issue is that e.g. tile has:
#define __smp_mb__after_atomic()  do { } while (0)

and this is cheaper than barrier().

For this reason I left
#define smp_mb__after_atomic()  __smp_mb__after_atomic()
in place there.

Now, of course I can do (in asm-generic):

#ifndef smp_mb__after_atomic
static inline void smp_mb__after_atomic(void)
{
...
}
#endif

but this seems ugly: architectures do defines, generic
version does inline.


And that is not all: APIs like smp_store_mb can take
a variety of types as arguments so they pretty much
must be implemented as macros.

Teaching checkpatch.pl to complain about it seems like the cleanest
approach.
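
To illustrate the smp_store_mb() point, this is roughly its generic shape; it
has to stay a macro because "var" can be any scalar lvalue:

#define smp_store_mb(var, value)	\
	do { WRITE_ONCE(var, value); smp_mb(); } while (0)

/* works for an int flag, a u8, a pointer, ... e.g.: */
smp_store_mb(current->state, TASK_RUNNING);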

> -- 
> RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.

Re: [PATCH v3 0/2] cpufreq: powernv: Redesign the presentation of throttle notification

2016-01-05 Thread Stewart Smith
Shilpasri G Bhat  writes:
> In POWER8, OCC(On-Chip-Controller) can throttle the frequency of the
> CPU when the chip crosses its thermal and power limits. Currently,
> powernv-cpufreq driver detects and reports this event as a console
> message. Some boxes may not sustain the max turbo frequency in all
> conditions and can be throttled frequently. This can lead to the
> flooding of console with throttle messages. So this patchset aims to
> redesign the presentation of this event via sysfs counters and
> tracepoints.

This likely should CC stable@ as we have seen this on some machines, at
least in the lab.

-- 
Stewart Smith
OPAL Architect, IBM.


Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread David Miller
From: Rabin Vincent 
Date: Tue,  5 Jan 2016 16:23:07 +0100

> The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> instructions since it XORs A with X while all the others replace A with
> some loaded value.  All the BPF JITs fail to clear A if this is used as
> the first instruction in a filter.  This was found using american fuzzy
> lop.
> 
> Add a helper to determine if A needs to be cleared given the first
> instruction in a filter, and use this in the JITs.  Except for ARM, the
> rest have only been compile-tested.
> 
> Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
> Signed-off-by: Rabin Vincent 

Applied and queued up for -stable, thanks!

[PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Rabin Vincent
The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.  This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs.  Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent 
---
 arch/arm/net/bpf_jit_32.c   | 16 +---
 arch/mips/net/bpf_jit.c | 16 +---
 arch/powerpc/net/bpf_jit_comp.c | 13 ++---
 arch/sparc/net/bpf_jit_comp.c   | 17 ++---
 include/linux/filter.h  | 19 +++
 5 files changed, 25 insertions(+), 56 deletions(-)
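
The include/linux/filter.h hunk is not shown in this excerpt; based on the call
sites below and the removed is_load_to_a() logic, the helper presumably looks
something like this:

static inline bool bpf_needs_clear_a(const struct sock_filter *first)
{
	switch (first->code) {
	case BPF_RET | BPF_K:
	case BPF_LD | BPF_W | BPF_LEN:
		return false;

	case BPF_LD | BPF_W | BPF_ABS:
	case BPF_LD | BPF_H | BPF_ABS:
	case BPF_LD | BPF_B | BPF_ABS:
		if (first->k == SKF_AD_OFF + SKF_AD_ALU_XOR_X)
			return true;	/* A ^= X reads A before loading it */
		return false;

	default:
		return true;	/* don't leak kernel data to userspace */
	}
}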

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 591f9db3bf40..e153eb065fe4 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -187,19 +187,6 @@ static inline int mem_words_used(struct jit_ctx *ctx)
return fls(ctx->seen & SEEN_MEM);
 }
 
-static inline bool is_load_to_a(u16 inst)
-{
-   switch (inst) {
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   return true;
-   default:
-   return false;
-   }
-}
-
 static void jit_fill_hole(void *area, unsigned int size)
 {
u32 *ptr;
@@ -211,7 +198,6 @@ static void jit_fill_hole(void *area, unsigned int size)
 static void build_prologue(struct jit_ctx *ctx)
 {
u16 reg_set = saved_regs(ctx);
-   u16 first_inst = ctx->skf->insns[0].code;
u16 off;
 
 #ifdef CONFIG_FRAME_POINTER
@@ -241,7 +227,7 @@ static void build_prologue(struct jit_ctx *ctx)
emit(ARM_MOV_I(r_X, 0), ctx);
 
/* do not leak kernel data to userspace */
-   if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst)))
+   if (bpf_needs_clear_a(&ctx->skf->insns[0]))
emit(ARM_MOV_I(r_A, 0), ctx);
 
/* stack space for the BPF_MEM words */
diff --git a/arch/mips/net/bpf_jit.c b/arch/mips/net/bpf_jit.c
index 77cb27309db2..1a8c96035716 100644
--- a/arch/mips/net/bpf_jit.c
+++ b/arch/mips/net/bpf_jit.c
@@ -521,19 +521,6 @@ static inline u16 align_sp(unsigned int num)
return num;
 }
 
-static bool is_load_to_a(u16 inst)
-{
-   switch (inst) {
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   return true;
-   default:
-   return false;
-   }
-}
-
 static void save_bpf_jit_regs(struct jit_ctx *ctx, unsigned offset)
 {
int i = 0, real_off = 0;
@@ -614,7 +601,6 @@ static unsigned int get_stack_depth(struct jit_ctx *ctx)
 
 static void build_prologue(struct jit_ctx *ctx)
 {
-   u16 first_inst = ctx->skf->insns[0].code;
int sp_off;
 
/* Calculate the total offset for the stack pointer */
@@ -641,7 +627,7 @@ static void build_prologue(struct jit_ctx *ctx)
emit_jit_reg_move(r_X, r_zero, ctx);
 
/* Do not leak kernel data to userspace */
-   if ((first_inst != (BPF_RET | BPF_K)) && !(is_load_to_a(first_inst)))
+   if (bpf_needs_clear_a(&ctx->skf->insns[0]))
emit_jit_reg_move(r_A, r_zero, ctx);
 }
 
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 04782164ee67..2d66a8446198 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -78,18 +78,9 @@ static void bpf_jit_build_prologue(struct bpf_prog *fp, u32 
*image,
PPC_LI(r_X, 0);
}
 
-   switch (filter[0].code) {
-   case BPF_RET | BPF_K:
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-   case BPF_LD | BPF_H | BPF_ABS:
-   case BPF_LD | BPF_B | BPF_ABS:
-   /* first instruction sets A register (or is RET 'constant') */
-   break;
-   default:
-   /* make sure we dont leak kernel information to user */
+   /* make sure we dont leak kernel information to user */
+   if (bpf_needs_clear_a(&filter[0]))
PPC_LI(r_A, 0);
-   }
 }
 
 static void bpf_jit_build_epilogue(u32 *image, struct codegen_context *ctx)
diff --git a/arch/sparc/net/bpf_jit_comp.c b/arch/sparc/net/bpf_jit_comp.c
index 22564f5f2364..3e6e05a7c4c2 100644
--- a/arch/sparc/net/bpf_jit_comp.c
+++ b/arch/sparc/net/bpf_jit_comp.c
@@ -420,22 +420,9 @@ void bpf_jit_compile(struct bpf_prog *fp)
}
emit_reg_move(O7, r_saved_O7);
 
-   switch (filter[0].code) {
-   case BPF_RET | BPF_K:
-   case BPF_LD | BPF_W | BPF_LEN:
-   case BPF_LD | BPF_W | BPF_ABS:
-  

Re: [RFC PATCH v2 1/3] PCI: Add support for enforcing all MMIO BARs to be page aligned

2016-01-05 Thread Yongji Xie

On 2016/1/5 4:47, Alex Williamson wrote:

On Thu, 2015-12-31 at 16:50 +0800, Yongji Xie wrote:

When vfio passes through a PCI device whose MMIO BARs
are smaller than PAGE_SIZE, the guest will not handle the
mmio accesses to the BARs, which leads to mmio emulation
in the host.

This is because vfio will not allow passthrough of one
BAR's mmio page which may be shared with other BARs.

To solve this performance issue, this patch adds a kernel
parameter "pci=resource_page_aligned=on" to enforce
the alignment of all MMIO BARs to be at least PAGE_SIZE,
so that one BAR's mmio page would not be shared with other
BARs. We can also disable it through kernel parameter
"pci=resource_page_aligned=off".

Shouldn't this somehow be associated with the realloc option?  I don't
think PCI code will attempt to reprogram anything unless it needs to
otherwise.


So you mean we need to ignore firmware setup and force re-assigning all
resources if we want to use the option "pci=resource_page_aligned=on"?


For the default value of this parameter, we think it should be
arch-independent, so we add a macro PCI_ENABLE_RESOURCE_PAGE_ALIGNED
to change it. And we define this macro to enable this parameter
by default on the PPC64 platform, which can easily hit this
performance issue because its PAGE_SIZE is 64KB.

Signed-off-by: Yongji Xie 
---
  Documentation/kernel-parameters.txt |4 
  arch/powerpc/include/asm/pci.h  |   11 +++
  drivers/pci/pci.c   |   17 +
  drivers/pci/pci.h   |7 ++-
  include/linux/pci.h |2 ++
  5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 742f69d..a53aaee 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2857,6 +2857,10 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
PAGE_SIZE is used as alignment.
PCI-PCI bridge can be specified, if resource
windows need to be expanded.
+   resource_page_aligned=  Enable/disable enforcing the alignment
+   of all PCI devices' memory resources to be
+   at least PAGE_SIZE.
+   Format: { "on" | "off" }
ecrc=   Enable/disable PCIe ECRC (transaction layer
end-to-end CRC checking).
bios: Use BIOS/firmware settings. This is the
diff --git a/arch/powerpc/include/asm/pci.h b/arch/powerpc/include/asm/pci.h
index 3453bd8..27bff59 100644
--- a/arch/powerpc/include/asm/pci.h
+++ b/arch/powerpc/include/asm/pci.h
@@ -136,6 +136,17 @@ extern pgprot_tpci_phys_mem_access_prot(struct file 
*file,
 unsigned long pfn,
 unsigned long size,
 pgprot_t prot);
+#ifdef CONFIG_PPC64
+
+/* For PPC64, We enforce all PCI MMIO BARs to be page aligned
+ * by default. This would be helpful to improve performance
+ * when we passthrough a PCI device of which BARs are smaller
+ * than PAGE_SIZE(64KB). And we can use bootcmd
+ * "pci=resource_page_aligned=off" to disable it.
+ */
+#define PCI_ENABLE_RESOURCE_PAGE_ALIGNED
+
+#endif

This should be done with something like HAVE_PCI_DEFAULT_RESOURCE_PAGE_ALIGNED
in arch/powerpc/include/asm


OK, I will fix it in next version.


  #define HAVE_ARCH_PCI_RESOURCE_TO_USER
  extern void pci_resource_to_user(const struct pci_dev *dev, int bar,
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 314db8c..9f14ba5 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -99,6 +99,13 @@ u8 pci_cache_line_size;
   */
  unsigned int pcibios_max_latency = 255;
  
+#ifdef PCI_ENABLE_RESOURCE_PAGE_ALIGNED
+bool pci_resource_page_aligned = true;
+#else
+bool pci_resource_page_aligned;
+#endif
+EXPORT_SYMBOL(pci_resource_page_aligned);

Couldn't this be done in a single line with IS_ENABLED() macro?


I'm not sure whether the IS_ENABLED() macro should be used there because
it is always used for CONFIG_ options.
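
For reference, the single-line form being suggested would be something like
this (a sketch; note IS_ENABLED() only yields 1 if the macro is defined to 1,
so the arch header would need "#define PCI_ENABLE_RESOURCE_PAGE_ALIGNED 1"
rather than an empty definition):

bool pci_resource_page_aligned = IS_ENABLED(PCI_ENABLE_RESOURCE_PAGE_ALIGNED);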


Should this symbol be GPL-only?


Yes, it will be fixed in next version.


+
  /* If set, the PCIe ARI capability will not be used. */
  static bool pcie_ari_disabled;
  
@@ -4746,6 +4753,14 @@ static ssize_t pci_resource_alignment_store(struct bus_type *bus,

  BUS_ATTR(resource_alignment, 0644, pci_resource_alignment_show,
pci_resource_alignment_store);
  
+static void pci_resource_get_page_aligned(char *str)
+{
+   if (!strncmp(str, "off", 3))
+   pci_resource_page_aligned = false;
+   else if (!strncmp(str, "on", 2))
+   pci_resource_page_aligned = true;
+}
+
  static int __init pci_resource_alignment_sysfs_init(void)
  {
 

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Michael S. Tsirkin
On Tue, Jan 05, 2016 at 01:08:52PM +0100, Martin Schwidefsky wrote:
> On Tue, 5 Jan 2016 11:30:19 +0200
> "Michael S. Tsirkin"  wrote:
> 
> > On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote:
> > > On Mon, 4 Jan 2016 22:18:58 +0200
> > > "Michael S. Tsirkin"  wrote:
> > > 
> > > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > > > > This defines __smp_xxx barriers for s390,
> > > > > > for use by virtualization.
> > > > > > 
> > > > > > Some smp_xxx barriers are removed as they are
> > > > > > defined correctly by asm-generic/barrier.h
> > > > > > 
> > > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > > > > unconditionally on this architecture.
> > > > > > 
> > > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > > Acked-by: Arnd Bergmann 
> > > > > > ---
> > > > > >  arch/s390/include/asm/barrier.h | 15 +--
> > > > > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/s390/include/asm/barrier.h 
> > > > > > b/arch/s390/include/asm/barrier.h
> > > > > > index c358c31..fbd25b2 100644
> > > > > > --- a/arch/s390/include/asm/barrier.h
> > > > > > +++ b/arch/s390/include/asm/barrier.h
> > > > > > @@ -26,18 +26,21 @@
> > > > > >  #define wmb()  barrier()
> > > > > >  #define dma_rmb()  mb()
> > > > > >  #define dma_wmb()  mb()
> > > > > > -#define smp_mb()   mb()
> > > > > > -#define smp_rmb()  rmb()
> > > > > > -#define smp_wmb()  wmb()
> > > > > > -
> > > > > > -#define smp_store_release(p, v)
> > > > > > \
> > > > > > +#define __smp_mb() mb()
> > > > > > +#define __smp_rmb()  rmb()
> > > > > > +#define __smp_wmb()  wmb()
> > > > > > +#define smp_mb()   __smp_mb()
> > > > > > +#define smp_rmb()  __smp_rmb()
> > > > > > +#define smp_wmb()  __smp_wmb()
> > > > > 
> > > > > Why define the smp_*mb() primitives here? Would not the inclusion of
> > > > > asm-generic/barrier.h do this?
> > > > 
> > > > No because the generic one is a nop on !SMP, this one isn't.
> > > > 
> > > > Pls note this patch is just reordering code without making
> > > > functional changes.
> > > > And at the moment, on s390 smp_xxx barriers are always non empty.
> > > 
> > > The s390 kernel is SMP to 99.99%; we just didn't bother with a
> > > non-smp variant for the memory barriers. If the generic header
> > > is used we'd get the non-smp version for free. It will save a
> > > small amount of text space for CONFIG_SMP=n.
> > 
> > OK, so I'll queue a patch to do this then?
> 
> Yes please.

OK, I'll add a patch on top in v3.

> > Just to make sure: the question would be, are smp_xxx barriers ever used
> > in s390 arch specific code to flush in/out memory accesses for
> > synchronization with the hypervisor?
> > 
> > I went over s390 arch code and it seems to me the answer is no
> > (except of course for virtio).
> 
> Correct. Guest to host communication either uses instructions which
> imply a memory barrier or QDIO which uses atomics.

And atomics imply a barrier on s390, right?

> > But I also see a lot of weirdness on this architecture.
> 
> Mostly historical, s390 actually is one of the easiest architectures in
> regard to memory barriers.
> 
> > I found these calls:
> > 
> > arch/s390/include/asm/bitops.h: smp_mb__before_atomic();
> > arch/s390/include/asm/bitops.h: smp_mb();
> > 
> > Not used in arch specific code so this is likely OK.
> 
> This was introduced with git commit 5402ea6af11dc5a9; the smp_mb
> and smp_mb__before_atomic are used in clear_bit_unlock and
> __clear_bit_unlock, which are 1:1 copies of the code in
> include/asm-generic/bitops/lock.h. Only test_and_set_bit_lock differs
> from the generic implementation.

something to keep in mind, but
I'd rather not touch bitops at the moment - this patchset is already too big.
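
For reference, the generic code being discussed (quoted from memory from
include/asm-generic/bitops/lock.h of that era, so treat as approximate):

#define clear_bit_unlock(nr, addr)	\
do {					\
	smp_mb__before_atomic();	\
	clear_bit(nr, addr);		\
} while (0)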

> > arch/s390/kernel/vdso.c:smp_mb();
> > 
> > Looking at
> > Author: Christian Borntraeger 
> > Date:   Fri Sep 11 16:23:06 2015 +0200
> > 
> > s390/vdso: use correct memory barrier
> > 
> > By definition smp_wmb only orders writes against writes. (Finish all
> > previous writes, and do not start any future write). To protect the
> > vdso init code against early reads on other CPUs, let's use a full
> > smp_mb at the end of vdso init. As right now smp_wmb is implemented
> > as full serialization, this needs no stable backport, but this 
> > change
> > will be necessary if we reimplement smp_wmb.
> > 
> > ok from hypervisor point of view, but it's also strange:
> > 1. why isn't this 

[PATCH 1/2] powerpc: sparse: static-ify some things

2016-01-05 Thread Daniel Axtens
As sparse suggests, these should be made static.

Signed-off-by: Daniel Axtens 

---

These are random fixes in arch/powerpc/kernel: there's no real
pattern to them. It doesn't fix everything.
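
For context, the kind of diagnostic sparse emits for these (illustrative
output; the line number is approximate):

arch/powerpc/kernel/eeh_event.c:39:1: warning: symbol 'eeh_eventlist' was
not declared. Should it be static?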
---
 arch/powerpc/kernel/eeh_event.c | 2 +-
 arch/powerpc/kernel/ibmebus.c   | 2 +-
 arch/powerpc/kernel/mce.c   | 2 +-
 arch/powerpc/kernel/rtasd.c | 2 +-
 arch/powerpc/kernel/vio.c   | 4 ++--
 5 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/eeh_event.c b/arch/powerpc/kernel/eeh_event.c
index 4eefb6e34dbb..82e7327e3cd0 100644
--- a/arch/powerpc/kernel/eeh_event.c
+++ b/arch/powerpc/kernel/eeh_event.c
@@ -36,7 +36,7 @@
 
 static DEFINE_SPINLOCK(eeh_eventlist_lock);
 static struct semaphore eeh_eventlist_sem;
-LIST_HEAD(eeh_eventlist);
+static LIST_HEAD(eeh_eventlist);
 
 /**
  * eeh_event_handler - Dispatch EEH events.
diff --git a/arch/powerpc/kernel/ibmebus.c b/arch/powerpc/kernel/ibmebus.c
index ac86c53e2542..a89f4f7a66bd 100644
--- a/arch/powerpc/kernel/ibmebus.c
+++ b/arch/powerpc/kernel/ibmebus.c
@@ -408,7 +408,7 @@ static ssize_t modalias_show(struct device *dev,
return len+1;
 }
 
-struct device_attribute ibmebus_bus_device_attrs[] = {
+static struct device_attribute ibmebus_bus_device_attrs[] = {
__ATTR_RO(devspec),
__ATTR_RO(name),
__ATTR_RO(modalias),
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index b2eb4686bd8f..35138225af6e 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -37,7 +37,7 @@ static DEFINE_PER_CPU(int, mce_queue_count);
 static DEFINE_PER_CPU(struct machine_check_event[MAX_MC_EVT], mce_event_queue);
 
 static void machine_check_process_queued_event(struct irq_work *work);
-struct irq_work mce_event_process_work = {
+static struct irq_work mce_event_process_work = {
 .func = machine_check_process_queued_event,
 };
 
diff --git a/arch/powerpc/kernel/rtasd.c b/arch/powerpc/kernel/rtasd.c
index 5a2c049c1c61..c07f4b665b6b 100644
--- a/arch/powerpc/kernel/rtasd.c
+++ b/arch/powerpc/kernel/rtasd.c
@@ -442,7 +442,7 @@ static void do_event_scan(void)
 }
 
 static void rtas_event_scan(struct work_struct *w);
-DECLARE_DELAYED_WORK(event_scan_work, rtas_event_scan);
+static DECLARE_DELAYED_WORK(event_scan_work, rtas_event_scan);
 
 /*
  * Delay should be at least one second since some machines have problems if
diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c
index 5f8dcdaa2820..8d7358f3a273 100644
--- a/arch/powerpc/kernel/vio.c
+++ b/arch/powerpc/kernel/vio.c
@@ -87,7 +87,7 @@ struct vio_cmo_dev_entry {
  * @curr: bytes currently allocated
  * @high: high water mark for IO data usage
  */
-struct vio_cmo {
+static struct vio_cmo {
spinlock_t lock;
struct delayed_work balance_q;
struct list_head device_list;
@@ -615,7 +615,7 @@ static u64 vio_dma_get_required_mask(struct device *dev)
 return dma_iommu_ops.get_required_mask(dev);
 }
 
-struct dma_map_ops vio_dma_mapping_ops = {
+static struct dma_map_ops vio_dma_mapping_ops = {
.alloc = vio_dma_iommu_alloc_coherent,
.free  = vio_dma_iommu_free_coherent,
.mmap  = dma_direct_mmap_coherent,
-- 
2.6.4


[PATCH 2/2] powerpc: sparse: Include headers for __weak symbols

2016-01-05 Thread Daniel Axtens
Sometimes when sparse warns about undefined symbols, it isn't
because they should have 'static' added; it's because they're
overriding __weak symbols defined elsewhere, and the header has
been missed.

Fix a few of them by adding appropriate headers.
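
A condensed sketch of the __weak pattern (file and symbol names are made
up; __weak normally comes from the kernel's compiler headers and is
spelled out here so the sketch stands alone):

/* hook.h - one declaration, visible to both definitions */
void machine_hook(void);

/* generic.c - overridable default */
#define __weak __attribute__((weak))
#include "hook.h"
void __weak machine_hook(void) { }   /* used when nothing overrides it */

/* arch.c - the override; without #include "hook.h" sparse sees an
 * undeclared external symbol and suggests 'static', which would
 * silently break the override. */
#include "hook.h"
void machine_hook(void)
{
        /* arch-specific work */
}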

Signed-off-by: Daniel Axtens 
---
 arch/powerpc/kernel/process.c | 2 ++
 arch/powerpc/kernel/prom.c| 1 +
 arch/powerpc/kernel/time.c| 1 +
 arch/powerpc/mm/mmap.c| 1 +
 4 files changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index dccc87e8fee5..afd81390add0 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -55,6 +56,7 @@
 #include 
 #endif
 #include 
+#include 
 #include 
 #include 
 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 7030b035905d..1b082c729c29 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 81b0900a39ee..3ed9a5a21d77 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c
index 0f0502e12f6c..1ccb2f6ecaf9 100644
--- a/arch/powerpc/mm/mmap.c
+++ b/arch/powerpc/mm/mmap.c
@@ -26,6 +26,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * Top of mmap area (just below the process stack).
-- 
2.6.4


Re: [PATCH] net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-05 Thread Rabin Vincent
On Tue, Jan 05, 2016 at 08:00:45AM -0800, Eric Dumazet wrote:
> On Tue, 2016-01-05 at 16:23 +0100, Rabin Vincent wrote:
> > The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
> > instructions since it XORs A with X while all the others replace A with
> > some loaded value.  All the BPF JITs fail to clear A if this is used as
> > the first instruction in a filter.
> 
> Is x86_64 part of this 'All' subset ? ;)

No, because it's an eBPF JIT.
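
For reference, a minimal classic-BPF filter that hits the case under
discussion (a sketch using the uapi macros from linux/filter.h). A
conforming interpreter starts with A = X = 0, so this filter must return
0; a JIT that fails to zero A before the XOR returns whatever garbage
was left in A's register:

#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, SKF_AD_* */

static struct sock_filter prog[] = {
        /* first instruction: A = A ^ X, with A never initialized */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, SKF_AD_OFF + SKF_AD_ALU_XOR_X),
        /* return A as the filter verdict */
        BPF_STMT(BPF_RET | BPF_A, 0),
};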

Re: [PATCH v2 12/32] x86/um: reuse asm-generic/barrier.h

2016-01-05 Thread Richard Weinberger
On 31.12.2015 20:07, Michael S. Tsirkin wrote:
> On x86/um CONFIG_SMP is never defined.  As a result, several macros
> match the asm-generic variant exactly. Drop the local definitions and
> pull in asm-generic/barrier.h instead.
> 
> This is in preparation to refactoring this code area.
> 
> Signed-off-by: Michael S. Tsirkin 
> Acked-by: Arnd Bergmann 

Acked-by: Richard Weinberger 

Thanks,
//richard

Re: [PATCH v2 22/32] s390: define __smp_xxx

2016-01-05 Thread Martin Schwidefsky
On Tue, 5 Jan 2016 11:30:19 +0200
"Michael S. Tsirkin"  wrote:

> On Tue, Jan 05, 2016 at 09:13:19AM +0100, Martin Schwidefsky wrote:
> > On Mon, 4 Jan 2016 22:18:58 +0200
> > "Michael S. Tsirkin"  wrote:
> > 
> > > On Mon, Jan 04, 2016 at 02:45:25PM +0100, Peter Zijlstra wrote:
> > > > On Thu, Dec 31, 2015 at 09:08:38PM +0200, Michael S. Tsirkin wrote:
> > > > > This defines __smp_xxx barriers for s390,
> > > > > for use by virtualization.
> > > > > 
> > > > > Some smp_xxx barriers are removed as they are
> > > > > defined correctly by asm-generic/barriers.h
> > > > > 
> > > > > Note: smp_mb, smp_rmb and smp_wmb are defined as full barriers
> > > > > unconditionally on this architecture.
> > > > > 
> > > > > Signed-off-by: Michael S. Tsirkin 
> > > > > Acked-by: Arnd Bergmann 
> > > > > ---
> > > > >  arch/s390/include/asm/barrier.h | 15 +--
> > > > >  1 file changed, 9 insertions(+), 6 deletions(-)
> > > > > 
> > > > > diff --git a/arch/s390/include/asm/barrier.h b/arch/s390/include/asm/barrier.h
> > > > > index c358c31..fbd25b2 100644
> > > > > --- a/arch/s390/include/asm/barrier.h
> > > > > +++ b/arch/s390/include/asm/barrier.h
> > > > > @@ -26,18 +26,21 @@
> > > > >  #define wmb()  barrier()
> > > > >  #define dma_rmb()  mb()
> > > > >  #define dma_wmb()  mb()
> > > > > -#define smp_mb()  mb()
> > > > > -#define smp_rmb()  rmb()
> > > > > -#define smp_wmb()  wmb()
> > > > > -
> > > > > -#define smp_store_release(p, v)  \
> > > > > +#define __smp_mb()  mb()
> > > > > +#define __smp_rmb()  rmb()
> > > > > +#define __smp_wmb()  wmb()
> > > > > +#define smp_mb()  __smp_mb()
> > > > > +#define smp_rmb()  __smp_rmb()
> > > > > +#define smp_wmb()  __smp_wmb()
> > > > 
> > > > Why define the smp_*mb() primitives here? Would not the inclusion of
> > > > asm-generic/barrier.h do this?
> > > 
> > > No, because the generic one is a no-op on !SMP; this one isn't.
> > > 
> > > Pls note this patch is just reordering code without making
> > > functional changes.
> > > And at the moment, on s390 smp_xxx barriers are always non empty.
> > 
> > The s390 kernel is SMP in 99.99% of cases; we just didn't bother with
> > a non-SMP variant of the memory barriers. If the generic header
> > is used we'd get the non-smp version for free. It will save a
> > small amount of text space for CONFIG_SMP=n. 
> 
> OK, so I'll queue a patch to do this then?

Yes please.
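
What "for free" means concretely: after this series, asm-generic/barrier.h
maps the smp_* macros roughly like this (condensed sketch of the generic
header, not s390 code):

#ifdef CONFIG_SMP
#ifndef smp_mb
#define smp_mb()        __smp_mb()      /* real barrier on SMP builds */
#endif
#else  /* !CONFIG_SMP */
#ifndef smp_mb
#define smp_mb()        barrier()       /* compiler-only barrier on UP */
#endif
#endif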
 
> Just to make sure: the question would be, are smp_xxx barriers ever used
> in s390 arch specific code to flush in/out memory accesses for
> synchronization with the hypervisor?
> 
> I went over s390 arch code and it seems to me the answer is no
> (except of course for virtio).

Correct. Guest-to-host communication either uses instructions which
imply a memory barrier, or QDIO, which uses atomics.

> But I also see a lot of weirdness on this architecture.

Mostly historical; s390 is actually one of the easiest architectures
with regard to memory barriers.

> I found these calls:
> 
> arch/s390/include/asm/bitops.h: smp_mb__before_atomic();
> arch/s390/include/asm/bitops.h: smp_mb();
> 
> Not used in arch specific code so this is likely OK.

This was introduced with git commit 5402ea6af11dc5a9; the smp_mb
and smp_mb__before_atomic are used in clear_bit_unlock and
__clear_bit_unlock, which are 1:1 copies of the code in
include/asm-generic/bitops/lock.h. Only test_and_set_bit_lock differs
from the generic implementation.
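
For reference, the generic versions being mirrored read roughly as
follows (condensed from include/asm-generic/bitops/lock.h of that era):

/* clear_bit_unlock: release semantics via a barrier before the store */
#define clear_bit_unlock(nr, addr)      \
do {                                    \
        smp_mb__before_atomic();        \
        clear_bit(nr, addr);            \
} while (0)

/* __clear_bit_unlock: non-atomic store, so a full smp_mb() is used */
#define __clear_bit_unlock(nr, addr)    \
do {                                    \
        smp_mb();                       \
        __clear_bit(nr, addr);          \
} while (0)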

> arch/s390/kernel/vdso.c:smp_mb();
> 
> Looking at
>   Author: Christian Borntraeger 
>   Date:   Fri Sep 11 16:23:06 2015 +0200
> 
>   s390/vdso: use correct memory barrier
> 
>   By definition smp_wmb only orders writes against writes. (Finish all
>   previous writes, and do not start any future write). To protect the
>   vdso init code against early reads on other CPUs, let's use a full
>   smp_mb at the end of vdso init. As right now smp_wmb is implemented
>   as full serialization, this needs no stable backport, but this change
>   will be necessary if we reimplement smp_wmb.
> 
> ok from hypervisor point of view, but it's also strange:
> 1. why isn't this paired with another mb somewhere?
>    this seems to violate barrier pairing rules.
> 2. how does smp_mb protect against early reads on other CPUs?
>    It normally does not: it orders reads from this CPU versus writes
>    from same CPU. But init code does not appear to read anything.
>    Maybe this is some s390 specific trick?
> 
> I could not figure out the above commit.

That smp_mb can be removed. The initial s390 vdso code is heavily 
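
On the pairing question raised above, the usual publish/consume shape
is the following generic sketch (not the vdso code; "ready", "data",
compute() and use() are made-up names):

/* writer, e.g. init code on the boot CPU */
data = compute();       /* the stores to be published...            */
smp_wmb();              /* ...must be visible before the flag is    */
WRITE_ONCE(ready, 1);

/* reader on another CPU - its smp_rmb() pairs with the smp_wmb() */
if (READ_ONCE(ready)) {
        smp_rmb();      /* order the flag read before the data read */
        use(data);
}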

Re: [PATCH v2 15/32] powerpc: define __smp_xxx

2016-01-05 Thread Boqun Feng
On Tue, Jan 05, 2016 at 06:16:48PM +0200, Michael S. Tsirkin wrote:
[snip]
> > > > Another thing is that smp_lwsync() may have a third user (other than
> > > > smp_load_acquire() and smp_store_release()):
> > > > 
> > > > http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877
> > > > 
> > > > I'm OK to change my patch accordingly, but do we really want
> > > > smp_lwsync() to get involved in this cleanup? If I understand you
> > > > correctly, this cleanup focuses on external API like smp_{r,w,}mb(),
> > > > while smp_lwsync() is internal to PPC.
> > > > 
> > > > Regards,
> > > > Boqun
> > > 
> > > I think you missed the leading ___ :)
> > > 
> > 
> > What I meant was that smp_lwsync() was originally internal to PPC,
> > but never mind ;-)
> > 
> > > smp_store_release is external and it needs __smp_lwsync as
> > > defined here.
> > > 
> > > I can duplicate some code and have smp_lwsync *not* call __smp_lwsync
> > 
> > You mean bringing smp_lwsync() back? Because I haven't seen you define
> > it in asm-generic/barrier.h in previous patches, and you just delete it
> > in this patch.
> > 
> > > but why do this? Still, if you prefer it this way,
> > > please let me know.
> > > 
> > 
> > I think deleting smp_lwsync() is fine, though I need to change atomic
> > variants patches on PPC because of it ;-/
> > 
> > Regards,
> > Boqun
> 
> Sorry, I don't understand - why do you have to do anything?
> I changed all users of smp_lwsync so they
> use __smp_lwsync on SMP and barrier() on !SMP.
> 
> This is exactly the current behaviour, I also tested that
> generated code does not change at all.
> 
> Is there a patch in your tree that conflicts with this?
> 

Because a patchset which implements atomic relaxed/acquire/release
variants on PPC uses smp_lwsync(), this gives it another user; please
see this mail:

http://article.gmane.org/gmane.linux.ports.ppc.embedded/89877

in definition of PPC's __atomic_op_release().


But I think removing smp_lwsync() is a good idea. Actually, we could go
further and remove __smp_lwsync() too, letting __smp_load_acquire and
__smp_store_release call __lwsync() directly, but that is another thing.

Anyway, I will modify my patch.

Regards,
Boqun

> 
> > > > >  	WRITE_ONCE(*p, v);				\
> > > > >  } while (0)
> > > > >  
> > > > > -#define smp_load_acquire(p)				\
> > > > > +#define __smp_load_acquire(p)				\
> > > > >  ({							\
> > > > >  	typeof(*p) ___p1 = READ_ONCE(*p);		\
> > > > >  	compiletime_assert_atomic_type(*p);		\
> > > > > -	smp_lwsync();					\
> > > > > +	__smp_lwsync();					\
> > > > >  	___p1;						\
> > > > >  })
> > > > >  
> > > > > -- 
> > > > > MST
> > > > > 

Re: [PATCH v6 2/4] powerpc: atomic: Implement atomic{,64}_*_return_* variants

2016-01-05 Thread Boqun Feng
Hi all,

I will resend this one to avoid a potential conflict with:

http://article.gmane.org/gmane.linux.kernel/2116880

by open coding smp_lwsync() with:

__asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");

Regards,
Boqun



[PATCH RESEND v6 2/4] powerpc: atomic: Implement atomic{,64}_*_return_* variants

2016-01-05 Thread Boqun Feng
On powerpc, acquire and release semantics can be achieved with
lightweight barriers ("lwsync" and "ctrl+isync"), which can be used to
implement __atomic_op_{acquire,release}.

For release semantics, we only need to ensure that all memory accesses
issued before the atomic take effect before its -store- part, so
"lwsync" is all we need. On platforms without "lwsync", "sync" should
be used. Therefore, __atomic_op_release() uses PPC_RELEASE_BARRIER.

For acquire semantics, "lwsync" is likewise all we need. However, on
platforms without "lwsync" we can use "isync" rather than "sync" as an
acquire barrier. Therefore, __atomic_op_acquire() uses
PPC_ACQUIRE_BARRIER, which is barrier() on UP, "lwsync" if available,
and "isync" otherwise.

Implement atomic{,64}_{add,sub,inc,dec}_return_relaxed, and build other
variants with these helpers.
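
For context, the generic code in include/linux/atomic.h then synthesizes
any variant an architecture does not provide from the _relaxed version
plus these hooks, roughly (condensed sketch of the generic mechanism,
not part of this patch):

#ifndef atomic_add_return_acquire
#define atomic_add_return_acquire(...)                                  \
        __atomic_op_acquire(atomic_add_return, __VA_ARGS__)
#endif

#ifndef atomic_add_return_release
#define atomic_add_return_release(...)                                  \
        __atomic_op_release(atomic_add_return, __VA_ARGS__)
#endif

#ifndef atomic_add_return  /* fully ordered version */
#define atomic_add_return(...)                                          \
        __atomic_op_fence(atomic_add_return, __VA_ARGS__)
#endif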

Signed-off-by: Boqun Feng 
---
 arch/powerpc/include/asm/atomic.h | 147 ++
 1 file changed, 85 insertions(+), 62 deletions(-)

diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h
index 55f106e..a35c277 100644
--- a/arch/powerpc/include/asm/atomic.h
+++ b/arch/powerpc/include/asm/atomic.h
@@ -12,6 +12,24 @@
 
 #define ATOMIC_INIT(i) { (i) }
 
+/*
+ * Since *_return_relaxed and {cmp}xchg_relaxed are implemented with
+ * a "bne-" instruction at the end, an isync is enough as an acquire barrier
+ * on platforms without lwsync.
+ */
+#define __atomic_op_acquire(op, args...)   \
+({ \
+   typeof(op##_relaxed(args)) __ret  = op##_relaxed(args); \
+   __asm__ __volatile__(PPC_ACQUIRE_BARRIER "" : : : "memory");\
+   __ret;  \
+})
+
+#define __atomic_op_release(op, args...)   \
+({ \
+   __asm__ __volatile__(PPC_RELEASE_BARRIER "" : : : "memory");\
+   op##_relaxed(args); \
+})
+
 static __inline__ int atomic_read(const atomic_t *v)
 {
int t;
@@ -42,27 +60,27 @@ static __inline__ void atomic_##op(int a, atomic_t *v)	\
: "cc");\
 }  \
 
-#define ATOMIC_OP_RETURN(op, asm_op)   \
-static __inline__ int atomic_##op##_return(int a, atomic_t *v) \
+#define ATOMIC_OP_RETURN_RELAXED(op, asm_op)   \
+static inline int atomic_##op##_return_relaxed(int a, atomic_t *v) \
 {  \
int t;  \
\
__asm__ __volatile__(   \
-	PPC_ATOMIC_ENTRY_BARRIER				\
-"1:	lwarx	%0,0,%2		# atomic_" #op "_return\n"	\
-	#asm_op " %0,%1,%0\n"					\
-	PPC405_ERR77(0,%2)					\
-"	stwcx.	%0,0,%2 \n"					\
+"1:	lwarx	%0,0,%3		# atomic_" #op "_return_relaxed\n"	\
+	#asm_op " %0,%2,%0\n"					\
+	PPC405_ERR77(0, %3)					\
+"	stwcx.	%0,0,%3\n"					\
 "	bne-	1b\n"						\
-	PPC_ATOMIC_EXIT_BARRIER					\
-	: "=&r" (t)						\
+	: "=&r" (t), "+m" (v->counter)				\
 	: "r" (a), "r" (&v->counter)				\
-	: "cc", "memory");					\
+	: "cc");						\
\
return t;   \
 }
 
-#define ATOMIC_OPS(op, asm_op) ATOMIC_OP(op, asm_op) ATOMIC_OP_RETURN(op, asm_op)
+#define ATOMIC_OPS(op, asm_op) \
+   ATOMIC_OP(op, asm_op)   \
+   ATOMIC_OP_RETURN_RELAXED(op, asm_op)
 
 ATOMIC_OPS(add, add)
 ATOMIC_OPS(sub, subf)
@@ -71,8 +89,11 @@ ATOMIC_OP(and, and)
 ATOMIC_OP(or, or)
 ATOMIC_OP(xor, xor)
 
+#define atomic_add_return_relaxed atomic_add_return_relaxed
+#define atomic_sub_return_relaxed atomic_sub_return_relaxed
+
 #undef ATOMIC_OPS

Re: [PATCH 1/2] powerpc: sparse: static-ify some things

2016-01-05 Thread Andrew Donnellan

On 06/01/16 11:45, Daniel Axtens wrote:

As sparse suggests, these should be made static.

Signed-off-by: Daniel Axtens 


Reviewed-by: Andrew Donnellan 

--
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited
