Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-06-13 Thread Michael Ellerman
Nick Desaulniers  writes:

> On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool
>  wrote:
>>
>> On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote:
>> > Segher, Cristophe, I suspect Clang is missing support for the %L and %U
>> > output templates [1].
...
>
> IIUC the bug report correctly, it looks like LLVM is failing for the
> __put_user_asm2_goto case for -m32.  A simple reproducer:
> https://godbolt.org/z/jBBF9b

If you add `-mregnames` you get register names:

https://godbolt.org/z/MxLjhF

foo:
stw %r3, 0(%r5)
stw %r4, 4(%r5)
blr


cheers


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-06-12 Thread Christophe Leroy




On 06/12/2020 09:33 PM, Nick Desaulniers wrote:


IIUC the bug report correctly, it looks like LLVM is failing for the
__put_user_asm2_goto case for -m32.  A simple reproducer:
https://godbolt.org/z/jBBF9b

void foo(long long in, long long* out) {
asm volatile(
   "stw%X1 %0, %1\n\t"
   "stw%X1 %L0, %L1"
   ::"r"(in), "m"(*out));
}
prints (in GCC):
foo:
   stw 3, 0(5)
   stw 4, 4(5)
   blr
(first time looking at ppc assembler, seems constants and registers
are not as easy to distinguish,
https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get
used to it." LOL, ok).


When I do ppc-linux-objdump -d vmlinux, registers and constants are 
easily distinguished, see below.


c0002284 :
c0002284:   3c 40 c0 3c lis r2,-16324
c0002288:   60 42 45 00 ori r2,r2,17664
c000228c:   3c 82 40 00 addis   r4,r2,16384
c0002290:   38 84 04 30 addir4,r4,1072
c0002294:   7c 93 43 a6 mtsprg  3,r4
c0002298:   3c 20 c0 3e lis r1,-16322
c000229c:   38 21 e0 00 addir1,r1,-8192
c00022a0:   38 00 00 00 li  r0,0
c00022a4:   94 01 1f f0 stwur0,8176(r1)
c00022a8:   48 35 e7 41 bl  c03609e8 
c00022ac:   38 60 00 00 li  r3,0
c00022b0:   7f e4 fb 78 mr  r4,r31
c00022b4:   48 35 e7 8d bl  c0360a40 
c00022b8:   48 35 eb e1 bl  c0360e98 
c00022bc:   3c c0 c0 3c lis r6,-16324
c00022c0:   3c c6 40 00 addis   r6,r6,16384
c00022c4:   7c df c3 a6 mtspr   799,r6
c00022c8:   3c 80 c0 00 lis r4,-16384
c00022cc:   60 84 22 e4 ori r4,r4,8932
c00022d0:   3c 84 40 00 addis   r4,r4,16384
c00022d4:   38 60 10 02 li  r3,4098
c00022d8:   7c 9a 03 a6 mtsrr0  r4
c00022dc:   7c 7b 03 a6 mtsrr1  r3
c00022e0:   4c 00 00 64 rfi
c00022e4:   7c 00 02 e4 tlbia
c00022e8:   7c 00 04 ac hwsync
c00022ec:   3c c6 c0 00 addis   r6,r6,-16384
c00022f0:   3c a0 c0 3c lis r5,-16324
c00022f4:   60 a5 40 00 ori r5,r5,16384
c00022f8:   90 a0 00 f0 stw r5,240(0)
c00022fc:   3c a5 40 00 addis   r5,r5,16384
c0002300:   90 c5 00 00 stw r6,0(r5)
c0002304:   38 80 10 32 li  r4,4146
c0002308:   3c 60 c0 35 lis r3,-16331
c000230c:   60 63 d6 a8 ori r3,r3,54952
c0002310:   7c 7a 03 a6 mtsrr0  r3
c0002314:   7c 9b 03 a6 mtsrr1  r4
c0002318:   4c 00 00 64 rfi

For GCC, I think you call tell you want register names with -mregnames

Christophe


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-06-12 Thread Segher Boessenkool
Hi!

On Fri, Jun 12, 2020 at 02:33:09PM -0700, Nick Desaulniers wrote:
> On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool
>  wrote:
> > The PowerPC part of
> > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
> > (sorry, no anchor) documents %U.
> 
> I thought those were constraints, not output templates?  Oh,
> The asm statement must also use %U as a placeholder for the
> “update” flag in the corresponding load or store instruction.
> got it.

Traditionally, *all* constraints were documented here, including the
ones that are only meant for GCC's internal use.  And the output
modifiers were largely not documented at all.

For GCC 10, for Power, I changed it to only document the constraints
that should be public in gcc.info (and everything in gccint.info).  The
output modifiers can neatly be documented here as well, since it such a
short section now.  We're not quite there yet, but getting there.

> > Traditionally the source code is the documentation for this.  The code
> > here starts with the comment
> >   /* Write second word of DImode or DFmode reference.  Works on register
> >  or non-indexed memory only.  */
> > (which is very out-of-date itself, it works fine for e.g. TImode as well,
> > but alas).
> >
> > Unit tests are completely unsuitable for most compiler things like this.
> 
> What? No, surely one may write tests for output operands.  Grepping
> for `%L` in gcc/ was less fun than I was hoping.

You should look for 'L' instead (incl. those quotes) ;-)

Unit tests are 100x as much work, and gets <5% of the problems, compared
to regression tests.  Unit tests only test the stuff you should have
written *anyway*.  It is much more useful to test that much higher level
things work, IMNSHO.

> > HtH,
> 
> Yes, perfect, thank you so much!  So it looks like LLVM does not yet
> handle %L properly for memory operands.
> https://bugs.llvm.org/show_bug.cgi?id=46186#c4
> It's neat to see how this is implemented in GCC (and how many aren't
> implemented in LLVM, yikes :( ).  For reference, this is implemented
> in PPCAsmPrinter::PrintAsmOperand() and
> PPCAsmPrinter::PrintAsmMemoryOperand() in
> llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp.  GCC switches first on the
> modifier characters, then the operand type.

That is what the rs6000 backend currently does, yeah.  The print_operand
function just gets passed the modifier character (as "int code", or 0 if
there is no modifier).  Since there are so many modifiers there aren't
really any better options than just doing a "switch (code)" around
everything else (well, things can be factored, some helper functions,
etc., but this is mostly very old code, and it has grown organically).

> LLVM dispatches on operand type, then modifier.

That is neater, certainly for REG operands.

> When I was looking into LLVM's AsmPrinter class,
> I was surprised to see it's basically an assembler that just has
> complex logic to just do a bunch of prints, so it makes sense to see
> that pattern in GCC literally calling printf.

GCC always outputs assembler code.  This is usually a big advantage, for
things like output_operand.

> Some things I don't understand from PPC parlance is the "mode"
> (preinc, predec, premodify) and small data operands?

"mode" is "machine mode" -- SImode and the like.  PRE_DEC etc. are
*codes* (rtx codes), like,  (mem:DF (pre_dec:SI (reg:SI 39)))  (straight
from the manual).

> IIUC the bug report correctly, it looks like LLVM is failing for the
> __put_user_asm2_goto case for -m32.  A simple reproducer:
> https://godbolt.org/z/jBBF9b
> 
> void foo(long long in, long long* out) {
> asm volatile(
>   "stw%X1 %0, %1\n\t"
>   "stw%X1 %L0, %L1"
>   ::"r"(in), "m"(*out));
> }

This is wrong if operands[0] is a register, btw.  So it should use 'o'
as constraint (not 'm'), and then the 'X' output modifier has become
useless.

> prints (in GCC):
> foo:
>   stw 3, 0(5)
>   stw 4, 4(5)
>   blr
> (first time looking at ppc assembler, seems constants and registers
> are not as easy to distinguish,

The instruction mnemonic always tells you what types all arguments are.
Traditionally we don't write spaces after commas, either.  That is
actually easier to read -- well, if you are used to it, anyway! :-)

> https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get
> used to it." LOL, ok).

Since quite a while you can write your assembler using register names as
well.  Not using the dangerous macros the Linux kernel had/has(with
which you can write "rN" in place of any "N", and it doesn't force you
to use the register name either, so you could write "li r3,r4" and
"mr r3,0" and even "addi r3,r0,1234", all very misleading).

> so that's "store word from register 3 into dereference of register 5
> plus 0, then store word from register 4 into dereference of register 5
> plus 4?"

Yup.

> Guessing the ppc32 abi is ILP32 putting long long's into two
> separate registers?

Yes, and the order is the same as it wou

Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-06-11 Thread Segher Boessenkool
On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote:
> Segher, Cristophe, I suspect Clang is missing support for the %L and %U
> output templates [1].

The arch/powerpc kernel first used the %U output modifier in 0c176fa80fdf
(from 2016), and %L in b8b572e1015f (2008).  include/asm-ppc (and ppc64)
have had %U since 2005 (1da177e4c3f4), and %L as well (0c541b4406a6).

> I've implemented support for some of these before
> in Clang via the documentation at [2], but these seem to be machine
> specific?

Yes, almost all output modifiers are.  Only %l, %a, %n, and part of %c
are generic (and %% and %= and on some targets, %{, %|, %}).

> Can you please point me to documentation/unit tests/source for
> these so that I can figure out what they should be doing, and look into
> implementing them in Clang?

The PowerPC part of
https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints
(sorry, no anchor) documents %U.

Traditionally the source code is the documentation for this.  The code
here starts with the comment
  /* Write second word of DImode or DFmode reference.  Works on register
 or non-indexed memory only.  */
(which is very out-of-date itself, it works fine for e.g. TImode as well,
but alas).

Unit tests are completely unsuitable for most compiler things like this.

The source code is gcc/config/rs6000/rs6000.c, easiest is to search for
'L' (with those quotes).  Function print_operand.

HtH,


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-28 Thread Michael Ellerman
On Fri, 2020-04-17 at 17:08:51 UTC, Christophe Leroy wrote:
> unsafe_put_user() is designed to take benefit of 'asm goto'.
> 
> Instead of using the standard __put_user() approach and branch
> based on the returned error, use 'asm goto' and make the
> exception code branch directly to the error label. There is
> no code anymore in the fixup section.
> 
> This change significantly simplifies functions using
> unsafe_put_user()
...
> 
> Signed-off-by: Christophe Leroy 
> Reviewed-by: Segher Boessenkool 

Applied to powerpc topic/uaccess-ppc, thanks.

https://git.kernel.org/powerpc/c/334710b1496af8a0960e70121f850e209c20958f

cheers


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-06 Thread Segher Boessenkool
On Wed, May 06, 2020 at 08:10:57PM +0200, Christophe Leroy wrote:
> Le 06/05/2020 à 19:58, Segher Boessenkool a écrit :
> >>  #define __put_user_asm_goto(x, addr, label, op)   \
> >>asm volatile goto(  \
> >>-   "1: " op "%U1%X1 %0,%1  # put_user\n"   \
> >>+   "1: " op "%X1 %0,%1 # put_user\n"   \
> >>EX_TABLE(1b, %l2)   \
> >>:   \
> >>-   : "r" (x), "m<>" (*addr)\
> >>+   : "r" (x), "m" (*addr)  \
> >>:   \
> >>: label)
> >
> >Like that.  But you will have to do that to *all* places we use the "<>"
> >constraints, or wait for more stuff to fail?  And, there probably are
> >places we *do* want update form insns used (they do help in some loops,
> >for example)?
> >
> 
> AFAICT, git grep "m<>" provides no result.

Ah, okay.

> However many places have %Ux:



Yes, all of those are from when "m" still meant what "m<>" is now.  For
seeing how many update form insns can be generated (and how much that
matters), these all should be fixed, at a minimum.


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-06 Thread Christophe Leroy




Le 06/05/2020 à 19:58, Segher Boessenkool a écrit :

On Wed, May 06, 2020 at 10:58:55AM +1000, Michael Ellerman wrote:

The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.


[ You shouldn't use 4.6.3, there has been 4.6.4 since a while.  And 4.6
   is nine years old now.  Most projects do not support < 4.8 anymore, on
   any architecture.  ]


Moving up to 4.6.4 wouldn't actually help with this though would it?


Nope.  But 4.6.4 is a bug-fix release, 91 bugs fixed since 4.6.3, so you
should switch to it if you can :-)


Also I have 4.6.3 compilers already built, I don't really have time to
rebuild them for 4.6.4.

The kernel has a top-level minimum version, which I'm not in charge of, see:

https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc


Yes, I know.  And it is much preferred not to have stricter requirements
for Power, I know that too.  Something has to give though :-/


There were discussions about making 4.8 the minimum, but I'm not sure
where they got to.


Yeah, just petered out I think?

All significant distros come with a 4.8 as system compiler.


Plain "m" works, how much does the "<>" affect code gen in practice?

A quick diff here shows no difference from removing "<>".


It will make it impossible to use update-form instructions here.  That
probably does not matter much at all, in this case.

If you remove the "<>" constraints, also remove the "%Un" output modifier?


So like this?

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 62cc8d7640ec..ca847aed8e45 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -207,10 +207,10 @@ do {  
\
  
  #define __put_user_asm_goto(x, addr, label, op)			\

asm volatile goto(  \
-   "1:" op "%U1%X1 %0,%1# put_user\n"  \
+   "1:" op "%X1 %0,%1   # put_user\n"  \
EX_TABLE(1b, %l2)   \
:   \
-   : "r" (x), "m<>" (*addr)  \
+   : "r" (x), "m" (*addr)  \
:   \
: label)


Like that.  But you will have to do that to *all* places we use the "<>"
constraints, or wait for more stuff to fail?  And, there probably are
places we *do* want update form insns used (they do help in some loops,
for example)?



AFAICT, git grep "m<>" provides no result.

However many places have %Ux:

arch/powerpc/boot/io.h:	__asm__ __volatile__("lbz%U1%X1 %0,%1; twi 
0,%0,0; isync"

arch/powerpc/boot/io.h: __asm__ __volatile__("stb%U0%X0 %1,%0; sync"
arch/powerpc/boot/io.h:	__asm__ __volatile__("lhz%U1%X1 %0,%1; twi 
0,%0,0; isync"

arch/powerpc/boot/io.h: __asm__ __volatile__("sth%U0%X0 %1,%0; sync"
arch/powerpc/boot/io.h:	__asm__ __volatile__("lwz%U1%X1 %0,%1; twi 
0,%0,0; isync"

arch/powerpc/boot/io.h: __asm__ __volatile__("stw%U0%X0 %1,%0; sync"
arch/powerpc/include/asm/atomic.h:	__asm__ __volatile__("lwz%U1%X1 
%0,%1" : "=r"(t) : "m"(v->counter));
arch/powerpc/include/asm/atomic.h:	__asm__ __volatile__("stw%U0%X0 
%1,%0" : "=m"(v->counter) : "r"(i));
arch/powerpc/include/asm/atomic.h:	__asm__ __volatile__("ld%U1%X1 %0,%1" 
: "=r"(t) : "m"(v->counter));
arch/powerpc/include/asm/atomic.h:	__asm__ __volatile__("std%U0%X0 
%1,%0" : "=m"(v->counter) : "r"(i));

arch/powerpc/include/asm/book3s/32/pgtable.h:   stw%U0%X0 %2,%0\n\
arch/powerpc/include/asm/book3s/32/pgtable.h:   stw%U0%X0 %L2,%1"
arch/powerpc/include/asm/io.h:	__asm__ __volatile__("sync;"#insn"%U1%X1 
%0,%1;twi 0,%0,0;isync"\
arch/powerpc/include/asm/io.h:	__asm__ __volatile__("sync;"#insn"%U0%X0 
%1,%0"			\

arch/powerpc/include/asm/nohash/pgtable.h:  stw%U0%X0 
%2,%0\n\
arch/powerpc/include/asm/nohash/pgtable.h:  stw%U0%X0 
%L2,%1"
arch/powerpc/kvm/powerpc.c:	asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : 
"=m" (fprd) : "m" (fprs)
arch/powerpc/kvm/powerpc.c:	asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : 
"=m" (fprs) : "m" (fprd)



Christophe


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-06 Thread Segher Boessenkool
On Wed, May 06, 2020 at 11:36:00AM +1000, Michael Ellerman wrote:
> >> As far as I understood that's mandatory on recent gcc to get the 
> >> pre-update form of the instruction. With older versions "m" was doing 
> >> the same, but not anymore.
> >
> > Yes.  How much that matters depends on the asm.  On older CPUs (6xx/7xx,
> > say) the update form was just as fast as the non-update form.  On newer
> > or bigger CPUs it is usually executed just the same as an add followed
> > by the memory access, so it just saves a bit of code size.
> 
> The update-forms are stdux, sthux etc. right?

And stdu, sthu, etc.  "x" is "indexed form" (reg+reg addressing).

> I don't see any change in the number of those with or without the
> constraint. That's using GCC 9.3.0.

It's most useful in loops (and happens more often there).  You probably
do not have many loops that allow update form insns.

> >> Should we ifdef the "m<>" or "m" based on GCC 
> >> version ?
> >
> > That will be a lot of churn.  Just make 4.8 minimum?
> 
> As I said in my other mail that's not really up to us. We could mandate
> a higher minimum for powerpc, but I'd rather not.

Yeah, I quite understand that.

> I think for now I'm inclined to just drop the "<>", and we can revisit
> in a release or two when hopefully GCC 4.8 has become the minimum.

An unhappy resolution, but it leaves a light on the horizon :-)

In that case, leave the "%Un", if you will but the "<>" back soonish?
Not much point removing it and putting it back later (it is harmless,
there are more instances of it in the kernel as well, since many years).

Thanks!


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-06 Thread Segher Boessenkool
On Wed, May 06, 2020 at 10:58:55AM +1000, Michael Ellerman wrote:
> >> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
> >
> > [ You shouldn't use 4.6.3, there has been 4.6.4 since a while.  And 4.6
> >   is nine years old now.  Most projects do not support < 4.8 anymore, on
> >   any architecture.  ]
> 
> Moving up to 4.6.4 wouldn't actually help with this though would it?

Nope.  But 4.6.4 is a bug-fix release, 91 bugs fixed since 4.6.3, so you
should switch to it if you can :-)

> Also I have 4.6.3 compilers already built, I don't really have time to
> rebuild them for 4.6.4.
> 
> The kernel has a top-level minimum version, which I'm not in charge of, see:
> 
> https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc

Yes, I know.  And it is much preferred not to have stricter requirements
for Power, I know that too.  Something has to give though :-/

> There were discussions about making 4.8 the minimum, but I'm not sure
> where they got to.

Yeah, just petered out I think?

All significant distros come with a 4.8 as system compiler.

> >> Plain "m" works, how much does the "<>" affect code gen in practice?
> >> 
> >> A quick diff here shows no difference from removing "<>".
> >
> > It will make it impossible to use update-form instructions here.  That
> > probably does not matter much at all, in this case.
> >
> > If you remove the "<>" constraints, also remove the "%Un" output modifier?
> 
> So like this?
> 
> diff --git a/arch/powerpc/include/asm/uaccess.h 
> b/arch/powerpc/include/asm/uaccess.h
> index 62cc8d7640ec..ca847aed8e45 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -207,10 +207,10 @@ do {
> \
>  
>  #define __put_user_asm_goto(x, addr, label, op)  \
>   asm volatile goto(  \
> - "1: " op "%U1%X1 %0,%1  # put_user\n"   \
> + "1: " op "%X1 %0,%1 # put_user\n"   \
>   EX_TABLE(1b, %l2)   \
>   :   \
> - : "r" (x), "m<>" (*addr)\
> + : "r" (x), "m" (*addr)  \
>   :   \
>   : label)

Like that.  But you will have to do that to *all* places we use the "<>"
constraints, or wait for more stuff to fail?  And, there probably are
places we *do* want update form insns used (they do help in some loops,
for example)?


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Michael Ellerman
Segher Boessenkool  writes:
> On Tue, May 05, 2020 at 05:40:21PM +0200, Christophe Leroy wrote:
>> >>+#define __put_user_asm_goto(x, addr, label, op)   \
>> >>+  asm volatile goto(  \
>> >>+  "1: " op "%U1%X1 %0,%1  # put_user\n"   \
>> >>+  EX_TABLE(1b, %l2)   \
>> >>+  :   \
>> >>+  : "r" (x), "m<>" (*addr)\
>> >
>> >The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
>> >
>> >Plain "m" works, how much does the "<>" affect code gen in practice?
>> >
>> >A quick diff here shows no difference from removing "<>".
>> 
>> It was recommended by Segher, there has been some discussion about it on 
>> v1 of this patch, see 
>> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/
>> 
>> As far as I understood that's mandatory on recent gcc to get the 
>> pre-update form of the instruction. With older versions "m" was doing 
>> the same, but not anymore.
>
> Yes.  How much that matters depends on the asm.  On older CPUs (6xx/7xx,
> say) the update form was just as fast as the non-update form.  On newer
> or bigger CPUs it is usually executed just the same as an add followed
> by the memory access, so it just saves a bit of code size.

The update-forms are stdux, sthux etc. right?

I don't see any change in the number of those with or without the
constraint. That's using GCC 9.3.0.

>> Should we ifdef the "m<>" or "m" based on GCC 
>> version ?
>
> That will be a lot of churn.  Just make 4.8 minimum?

As I said in my other mail that's not really up to us. We could mandate
a higher minimum for powerpc, but I'd rather not.

I think for now I'm inclined to just drop the "<>", and we can revisit
in a release or two when hopefully GCC 4.8 has become the minimum.

cheers


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Michael Ellerman
Segher Boessenkool  writes:
> Hi!
>
> On Wed, May 06, 2020 at 12:27:58AM +1000, Michael Ellerman wrote:
>> Christophe Leroy  writes:
>> > unsafe_put_user() is designed to take benefit of 'asm goto'.
>> >
>> > Instead of using the standard __put_user() approach and branch
>> > based on the returned error, use 'asm goto' and make the
>> > exception code branch directly to the error label. There is
>> > no code anymore in the fixup section.
>> >
>> > This change significantly simplifies functions using
>> > unsafe_put_user()
>> >
>> ...
>> >
>> > Signed-off-by: Christophe Leroy 
>> > ---
>> >  arch/powerpc/include/asm/uaccess.h | 61 +-
>> >  1 file changed, 52 insertions(+), 9 deletions(-)
>> >
>> > diff --git a/arch/powerpc/include/asm/uaccess.h 
>> > b/arch/powerpc/include/asm/uaccess.h
>> > index 9cc9c106ae2a..9365b59495a2 100644
>> > --- a/arch/powerpc/include/asm/uaccess.h
>> > +++ b/arch/powerpc/include/asm/uaccess.h
>> > @@ -196,6 +193,52 @@ do {  
>> > \
>> >  })
>> >  
>> >  
>> > +#define __put_user_asm_goto(x, addr, label, op)   \
>> > +  asm volatile goto(  \
>> > +  "1: " op "%U1%X1 %0,%1  # put_user\n"   \
>> > +  EX_TABLE(1b, %l2)   \
>> > +  :   \
>> > +  : "r" (x), "m<>" (*addr)\
>> 
>> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
>
> [ You shouldn't use 4.6.3, there has been 4.6.4 since a while.  And 4.6
>   is nine years old now.  Most projects do not support < 4.8 anymore, on
>   any architecture.  ]

Moving up to 4.6.4 wouldn't actually help with this though would it?

Also I have 4.6.3 compilers already built, I don't really have time to
rebuild them for 4.6.4.

The kernel has a top-level minimum version, which I'm not in charge of, see:

https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc


There were discussions about making 4.8 the minimum, but I'm not sure
where they got to.

>> Plain "m" works, how much does the "<>" affect code gen in practice?
>> 
>> A quick diff here shows no difference from removing "<>".
>
> It will make it impossible to use update-form instructions here.  That
> probably does not matter much at all, in this case.
>
> If you remove the "<>" constraints, also remove the "%Un" output modifier?

So like this?

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 62cc8d7640ec..ca847aed8e45 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -207,10 +207,10 @@ do {  
\
 
 #define __put_user_asm_goto(x, addr, label, op)\
asm volatile goto(  \
-   "1: " op "%U1%X1 %0,%1  # put_user\n"   \
+   "1: " op "%X1 %0,%1 # put_user\n"   \
EX_TABLE(1b, %l2)   \
:   \
-   : "r" (x), "m<>" (*addr)\
+   : "r" (x), "m" (*addr)  \
:   \
: label)
 


cheers


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Christophe Leroy

Hi,

Le 05/05/2020 à 16:27, Michael Ellerman a écrit :

Christophe Leroy  writes:

unsafe_put_user() is designed to take benefit of 'asm goto'.

Instead of using the standard __put_user() approach and branch
based on the returned error, use 'asm goto' and make the
exception code branch directly to the error label. There is
no code anymore in the fixup section.

This change significantly simplifies functions using
unsafe_put_user()


...


Signed-off-by: Christophe Leroy 
---
  arch/powerpc/include/asm/uaccess.h | 61 +-
  1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 9cc9c106ae2a..9365b59495a2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -196,6 +193,52 @@ do {   
\
  })
  
  
+#define __put_user_asm_goto(x, addr, label, op)			\

+   asm volatile goto(  \
+   "1:" op "%U1%X1 %0,%1# put_user\n"  \
+   EX_TABLE(1b, %l2)   \
+   :   \
+   : "r" (x), "m<>" (*addr)  \


The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.

Plain "m" works, how much does the "<>" affect code gen in practice?

A quick diff here shows no difference from removing "<>".


It was recommended by Segher, there has been some discussion about it on 
v1 of this patch, see 
https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/


As far as I understood that's mandatory on recent gcc to get the 
pre-update form of the instruction. With older versions "m" was doing 
the same, but not anymore. Should we ifdef the "m<>" or "m" based on GCC 
version ?


Christophe


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Segher Boessenkool
On Tue, May 05, 2020 at 05:40:21PM +0200, Christophe Leroy wrote:
> >>+#define __put_user_asm_goto(x, addr, label, op)\
> >>+   asm volatile goto(  \
> >>+   "1: " op "%U1%X1 %0,%1  # put_user\n"   \
> >>+   EX_TABLE(1b, %l2)   \
> >>+   :   \
> >>+   : "r" (x), "m<>" (*addr)\
> >
> >The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.
> >
> >Plain "m" works, how much does the "<>" affect code gen in practice?
> >
> >A quick diff here shows no difference from removing "<>".
> 
> It was recommended by Segher, there has been some discussion about it on 
> v1 of this patch, see 
> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/
> 
> As far as I understood that's mandatory on recent gcc to get the 
> pre-update form of the instruction. With older versions "m" was doing 
> the same, but not anymore.

Yes.  How much that matters depends on the asm.  On older CPUs (6xx/7xx,
say) the update form was just as fast as the non-update form.  On newer
or bigger CPUs it is usually executed just the same as an add followed
by the memory access, so it just saves a bit of code size.

> Should we ifdef the "m<>" or "m" based on GCC 
> version ?

That will be a lot of churn.  Just make 4.8 minimum?


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Segher Boessenkool
Hi!

On Wed, May 06, 2020 at 12:27:58AM +1000, Michael Ellerman wrote:
> Christophe Leroy  writes:
> > unsafe_put_user() is designed to take benefit of 'asm goto'.
> >
> > Instead of using the standard __put_user() approach and branch
> > based on the returned error, use 'asm goto' and make the
> > exception code branch directly to the error label. There is
> > no code anymore in the fixup section.
> >
> > This change significantly simplifies functions using
> > unsafe_put_user()
> >
> ...
> >
> > Signed-off-by: Christophe Leroy 
> > ---
> >  arch/powerpc/include/asm/uaccess.h | 61 +-
> >  1 file changed, 52 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/uaccess.h 
> > b/arch/powerpc/include/asm/uaccess.h
> > index 9cc9c106ae2a..9365b59495a2 100644
> > --- a/arch/powerpc/include/asm/uaccess.h
> > +++ b/arch/powerpc/include/asm/uaccess.h
> > @@ -196,6 +193,52 @@ do {   
> > \
> >  })
> >  
> >  
> > +#define __put_user_asm_goto(x, addr, label, op)\
> > +   asm volatile goto(  \
> > +   "1: " op "%U1%X1 %0,%1  # put_user\n"   \
> > +   EX_TABLE(1b, %l2)   \
> > +   :   \
> > +   : "r" (x), "m<>" (*addr)\
> 
> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.

[ You shouldn't use 4.6.3, there has been 4.6.4 since a while.  And 4.6
  is nine years old now.  Most projects do not support < 4.8 anymore, on
  any architecture.  ]

> Plain "m" works, how much does the "<>" affect code gen in practice?
> 
> A quick diff here shows no difference from removing "<>".

It will make it impossible to use update-form instructions here.  That
probably does not matter much at all, in this case.

If you remove the "<>" constraints, also remove the "%Un" output modifier?


Segher


Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-05-05 Thread Michael Ellerman
Christophe Leroy  writes:
> unsafe_put_user() is designed to take benefit of 'asm goto'.
>
> Instead of using the standard __put_user() approach and branch
> based on the returned error, use 'asm goto' and make the
> exception code branch directly to the error label. There is
> no code anymore in the fixup section.
>
> This change significantly simplifies functions using
> unsafe_put_user()
>
...
>
> Signed-off-by: Christophe Leroy 
> ---
>  arch/powerpc/include/asm/uaccess.h | 61 +-
>  1 file changed, 52 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/uaccess.h 
> b/arch/powerpc/include/asm/uaccess.h
> index 9cc9c106ae2a..9365b59495a2 100644
> --- a/arch/powerpc/include/asm/uaccess.h
> +++ b/arch/powerpc/include/asm/uaccess.h
> @@ -196,6 +193,52 @@ do { 
> \
>  })
>  
>  
> +#define __put_user_asm_goto(x, addr, label, op)  \
> + asm volatile goto(  \
> + "1: " op "%U1%X1 %0,%1  # put_user\n"   \
> + EX_TABLE(1b, %l2)   \
> + :   \
> + : "r" (x), "m<>" (*addr)\

The "m<>" here is breaking GCC 4.6.3, which we allegedly still support.

Plain "m" works, how much does the "<>" affect code gen in practice?

A quick diff here shows no difference from removing "<>".

cheers


[PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'

2020-04-17 Thread Christophe Leroy
unsafe_put_user() is designed to take benefit of 'asm goto'.

Instead of using the standard __put_user() approach and branch
based on the returned error, use 'asm goto' and make the
exception code branch directly to the error label. There is
no code anymore in the fixup section.

This change significantly simplifies functions using
unsafe_put_user()

Small exemple of the benefit with the following code:

struct test {
u32 item1;
u16 item2;
u8 item3;
u64 item4;
};

int set_test_to_user(struct test __user *test, u32 item1, u16 item2, u8 item3, 
u64 item4)
{
unsafe_put_user(item1, &test->item1, failed);
unsafe_put_user(item2, &test->item2, failed);
unsafe_put_user(item3, &test->item3, failed);
unsafe_put_user(item4, &test->item4, failed);
return 0;
failed:
return -EFAULT;
}

Before the patch:

0be8 :
 be8:   39 20 00 00 li  r9,0
 bec:   90 83 00 00 stw r4,0(r3)
 bf0:   2f 89 00 00 cmpwi   cr7,r9,0
 bf4:   40 9e 00 38 bne cr7,c2c 
 bf8:   b0 a3 00 04 sth r5,4(r3)
 bfc:   2f 89 00 00 cmpwi   cr7,r9,0
 c00:   40 9e 00 2c bne cr7,c2c 
 c04:   98 c3 00 06 stb r6,6(r3)
 c08:   2f 89 00 00 cmpwi   cr7,r9,0
 c0c:   40 9e 00 20 bne cr7,c2c 
 c10:   90 e3 00 08 stw r7,8(r3)
 c14:   91 03 00 0c stw r8,12(r3)
 c18:   21 29 00 00 subfic  r9,r9,0
 c1c:   7d 29 49 10 subfe   r9,r9,r9
 c20:   38 60 ff f2 li  r3,-14
 c24:   7d 23 18 38 and r3,r9,r3
 c28:   4e 80 00 20 blr
 c2c:   38 60 ff f2 li  r3,-14
 c30:   4e 80 00 20 blr

 <.fixup>:
...
  b8:   39 20 ff f2 li  r9,-14
  bc:   48 00 00 00 b   bc <.fixup+0xbc>
bc: R_PPC_REL24 .text+0xbf0
  c0:   39 20 ff f2 li  r9,-14
  c4:   48 00 00 00 b   c4 <.fixup+0xc4>
c4: R_PPC_REL24 .text+0xbfc
  c8:   39 20 ff f2 li  r9,-14
  cc:   48 00 00 00 b   cc <.fixup+0xcc>
cc: R_PPC_REL24 .text+0xc08
  d0:   39 20 ff f2 li  r9,-14
  d4:   48 00 00 00 b   d4 <.fixup+0xd4>
d4: R_PPC_REL24 .text+0xc18

 <__ex_table>:
...
a0: R_PPC_REL32 .text+0xbec
a4: R_PPC_REL32 .fixup+0xb8
a8: R_PPC_REL32 .text+0xbf8
ac: R_PPC_REL32 .fixup+0xc0
b0: R_PPC_REL32 .text+0xc04
b4: R_PPC_REL32 .fixup+0xc8
b8: R_PPC_REL32 .text+0xc10
bc: R_PPC_REL32 .fixup+0xd0
c0: R_PPC_REL32 .text+0xc14
c4: R_PPC_REL32 .fixup+0xd0

After the patch:

0be8 :
 be8:   90 83 00 00 stw r4,0(r3)
 bec:   b0 a3 00 04 sth r5,4(r3)
 bf0:   98 c3 00 06 stb r6,6(r3)
 bf4:   90 e3 00 08 stw r7,8(r3)
 bf8:   91 03 00 0c stw r8,12(r3)
 bfc:   38 60 00 00 li  r3,0
 c00:   4e 80 00 20 blr
 c04:   38 60 ff f2 li  r3,-14
 c08:   4e 80 00 20 blr

 <__ex_table>:
...
a0: R_PPC_REL32 .text+0xbe8
a4: R_PPC_REL32 .text+0xc04
a8: R_PPC_REL32 .text+0xbec
ac: R_PPC_REL32 .text+0xc04
b0: R_PPC_REL32 .text+0xbf0
b4: R_PPC_REL32 .text+0xc04
b8: R_PPC_REL32 .text+0xbf4
bc: R_PPC_REL32 .text+0xc04
c0: R_PPC_REL32 .text+0xbf8
c4: R_PPC_REL32 .text+0xc04

Signed-off-by: Christophe Leroy 
Reviewed-by: Segher Boessenkool 
---
v4:
- no change

v3:
- Added <> modifier to __put_user_asm_goto()
- Removed %U1 modifier to __put_user_asm2_goto()

v2:
- Grouped most __goto() macros together
- Removed stuff in .fixup section, referencing the error label
directly from the extable
- Using more flexible addressing in asm.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/uaccess.h | 61 +-
 1 file changed, 52 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 9cc9c106ae2a..9365b59495a2 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -93,12 +93,12 @@ static inline int __access_ok(unsigned long addr, unsigned 
long size,
 #define __get_user(x, ptr) \
__get_user_nocheck((x), (ptr), sizeof(*(ptr)), true)
 #define __put_user(x, ptr) \
-   __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), true)
+   __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)))
+#define __put_user_goto(x, ptr, label) \
+   __put_user_nocheck_goto((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), 
label)
 
 #define __get_user_allowed(x, ptr) \
__get_user_no