Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Nick Desaulniers writes: > On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool > wrote: >> >> On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote: >> > Segher, Cristophe, I suspect Clang is missing support for the %L and %U >> > output templates [1]. ... > > IIUC the bug report correctly, it looks like LLVM is failing for the > __put_user_asm2_goto case for -m32. A simple reproducer: > https://godbolt.org/z/jBBF9b If you add `-mregnames` you get register names: https://godbolt.org/z/MxLjhF foo: stw %r3, 0(%r5) stw %r4, 4(%r5) blr cheers
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On 06/12/2020 09:33 PM, Nick Desaulniers wrote: IIUC the bug report correctly, it looks like LLVM is failing for the __put_user_asm2_goto case for -m32. A simple reproducer: https://godbolt.org/z/jBBF9b void foo(long long in, long long* out) { asm volatile( "stw%X1 %0, %1\n\t" "stw%X1 %L0, %L1" ::"r"(in), "m"(*out)); } prints (in GCC): foo: stw 3, 0(5) stw 4, 4(5) blr (first time looking at ppc assembler, seems constants and registers are not as easy to distinguish, https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get used to it." LOL, ok). When I do ppc-linux-objdump -d vmlinux, registers and constants are easily distinguished, see below. c0002284 : c0002284: 3c 40 c0 3c lis r2,-16324 c0002288: 60 42 45 00 ori r2,r2,17664 c000228c: 3c 82 40 00 addis r4,r2,16384 c0002290: 38 84 04 30 addir4,r4,1072 c0002294: 7c 93 43 a6 mtsprg 3,r4 c0002298: 3c 20 c0 3e lis r1,-16322 c000229c: 38 21 e0 00 addir1,r1,-8192 c00022a0: 38 00 00 00 li r0,0 c00022a4: 94 01 1f f0 stwur0,8176(r1) c00022a8: 48 35 e7 41 bl c03609e8 c00022ac: 38 60 00 00 li r3,0 c00022b0: 7f e4 fb 78 mr r4,r31 c00022b4: 48 35 e7 8d bl c0360a40 c00022b8: 48 35 eb e1 bl c0360e98 c00022bc: 3c c0 c0 3c lis r6,-16324 c00022c0: 3c c6 40 00 addis r6,r6,16384 c00022c4: 7c df c3 a6 mtspr 799,r6 c00022c8: 3c 80 c0 00 lis r4,-16384 c00022cc: 60 84 22 e4 ori r4,r4,8932 c00022d0: 3c 84 40 00 addis r4,r4,16384 c00022d4: 38 60 10 02 li r3,4098 c00022d8: 7c 9a 03 a6 mtsrr0 r4 c00022dc: 7c 7b 03 a6 mtsrr1 r3 c00022e0: 4c 00 00 64 rfi c00022e4: 7c 00 02 e4 tlbia c00022e8: 7c 00 04 ac hwsync c00022ec: 3c c6 c0 00 addis r6,r6,-16384 c00022f0: 3c a0 c0 3c lis r5,-16324 c00022f4: 60 a5 40 00 ori r5,r5,16384 c00022f8: 90 a0 00 f0 stw r5,240(0) c00022fc: 3c a5 40 00 addis r5,r5,16384 c0002300: 90 c5 00 00 stw r6,0(r5) c0002304: 38 80 10 32 li r4,4146 c0002308: 3c 60 c0 35 lis r3,-16331 c000230c: 60 63 d6 a8 ori r3,r3,54952 c0002310: 7c 7a 03 a6 mtsrr0 r3 c0002314: 7c 9b 03 a6 mtsrr1 r4 c0002318: 4c 00 00 64 rfi For GCC, I think you call tell you want register names with -mregnames Christophe
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Hi! On Fri, Jun 12, 2020 at 02:33:09PM -0700, Nick Desaulniers wrote: > On Thu, Jun 11, 2020 at 4:53 PM Segher Boessenkool > wrote: > > The PowerPC part of > > https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints > > (sorry, no anchor) documents %U. > > I thought those were constraints, not output templates? Oh, > The asm statement must also use %U as a placeholder for the > “update” flag in the corresponding load or store instruction. > got it. Traditionally, *all* constraints were documented here, including the ones that are only meant for GCC's internal use. And the output modifiers were largely not documented at all. For GCC 10, for Power, I changed it to only document the constraints that should be public in gcc.info (and everything in gccint.info). The output modifiers can neatly be documented here as well, since it such a short section now. We're not quite there yet, but getting there. > > Traditionally the source code is the documentation for this. The code > > here starts with the comment > > /* Write second word of DImode or DFmode reference. Works on register > > or non-indexed memory only. */ > > (which is very out-of-date itself, it works fine for e.g. TImode as well, > > but alas). > > > > Unit tests are completely unsuitable for most compiler things like this. > > What? No, surely one may write tests for output operands. Grepping > for `%L` in gcc/ was less fun than I was hoping. You should look for 'L' instead (incl. those quotes) ;-) Unit tests are 100x as much work, and gets <5% of the problems, compared to regression tests. Unit tests only test the stuff you should have written *anyway*. It is much more useful to test that much higher level things work, IMNSHO. > > HtH, > > Yes, perfect, thank you so much! So it looks like LLVM does not yet > handle %L properly for memory operands. > https://bugs.llvm.org/show_bug.cgi?id=46186#c4 > It's neat to see how this is implemented in GCC (and how many aren't > implemented in LLVM, yikes :( ). For reference, this is implemented > in PPCAsmPrinter::PrintAsmOperand() and > PPCAsmPrinter::PrintAsmMemoryOperand() in > llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp. GCC switches first on the > modifier characters, then the operand type. That is what the rs6000 backend currently does, yeah. The print_operand function just gets passed the modifier character (as "int code", or 0 if there is no modifier). Since there are so many modifiers there aren't really any better options than just doing a "switch (code)" around everything else (well, things can be factored, some helper functions, etc., but this is mostly very old code, and it has grown organically). > LLVM dispatches on operand type, then modifier. That is neater, certainly for REG operands. > When I was looking into LLVM's AsmPrinter class, > I was surprised to see it's basically an assembler that just has > complex logic to just do a bunch of prints, so it makes sense to see > that pattern in GCC literally calling printf. GCC always outputs assembler code. This is usually a big advantage, for things like output_operand. > Some things I don't understand from PPC parlance is the "mode" > (preinc, predec, premodify) and small data operands? "mode" is "machine mode" -- SImode and the like. PRE_DEC etc. are *codes* (rtx codes), like, (mem:DF (pre_dec:SI (reg:SI 39))) (straight from the manual). > IIUC the bug report correctly, it looks like LLVM is failing for the > __put_user_asm2_goto case for -m32. A simple reproducer: > https://godbolt.org/z/jBBF9b > > void foo(long long in, long long* out) { > asm volatile( > "stw%X1 %0, %1\n\t" > "stw%X1 %L0, %L1" > ::"r"(in), "m"(*out)); > } This is wrong if operands[0] is a register, btw. So it should use 'o' as constraint (not 'm'), and then the 'X' output modifier has become useless. > prints (in GCC): > foo: > stw 3, 0(5) > stw 4, 4(5) > blr > (first time looking at ppc assembler, seems constants and registers > are not as easy to distinguish, The instruction mnemonic always tells you what types all arguments are. Traditionally we don't write spaces after commas, either. That is actually easier to read -- well, if you are used to it, anyway! :-) > https://developer.ibm.com/technologies/linux/articles/l-ppc/ say "Get > used to it." LOL, ok). Since quite a while you can write your assembler using register names as well. Not using the dangerous macros the Linux kernel had/has(with which you can write "rN" in place of any "N", and it doesn't force you to use the register name either, so you could write "li r3,r4" and "mr r3,0" and even "addi r3,r0,1234", all very misleading). > so that's "store word from register 3 into dereference of register 5 > plus 0, then store word from register 4 into dereference of register 5 > plus 4?" Yup. > Guessing the ppc32 abi is ILP32 putting long long's into two > separate registers? Yes, and the order is the same as it wou
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Thu, Jun 11, 2020 at 03:43:55PM -0700, Nick Desaulniers wrote: > Segher, Cristophe, I suspect Clang is missing support for the %L and %U > output templates [1]. The arch/powerpc kernel first used the %U output modifier in 0c176fa80fdf (from 2016), and %L in b8b572e1015f (2008). include/asm-ppc (and ppc64) have had %U since 2005 (1da177e4c3f4), and %L as well (0c541b4406a6). > I've implemented support for some of these before > in Clang via the documentation at [2], but these seem to be machine > specific? Yes, almost all output modifiers are. Only %l, %a, %n, and part of %c are generic (and %% and %= and on some targets, %{, %|, %}). > Can you please point me to documentation/unit tests/source for > these so that I can figure out what they should be doing, and look into > implementing them in Clang? The PowerPC part of https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints (sorry, no anchor) documents %U. Traditionally the source code is the documentation for this. The code here starts with the comment /* Write second word of DImode or DFmode reference. Works on register or non-indexed memory only. */ (which is very out-of-date itself, it works fine for e.g. TImode as well, but alas). Unit tests are completely unsuitable for most compiler things like this. The source code is gcc/config/rs6000/rs6000.c, easiest is to search for 'L' (with those quotes). Function print_operand. HtH, Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Fri, 2020-04-17 at 17:08:51 UTC, Christophe Leroy wrote: > unsafe_put_user() is designed to take benefit of 'asm goto'. > > Instead of using the standard __put_user() approach and branch > based on the returned error, use 'asm goto' and make the > exception code branch directly to the error label. There is > no code anymore in the fixup section. > > This change significantly simplifies functions using > unsafe_put_user() ... > > Signed-off-by: Christophe Leroy > Reviewed-by: Segher Boessenkool Applied to powerpc topic/uaccess-ppc, thanks. https://git.kernel.org/powerpc/c/334710b1496af8a0960e70121f850e209c20958f cheers
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Wed, May 06, 2020 at 08:10:57PM +0200, Christophe Leroy wrote: > Le 06/05/2020 à 19:58, Segher Boessenkool a écrit : > >> #define __put_user_asm_goto(x, addr, label, op) \ > >>asm volatile goto( \ > >>- "1: " op "%U1%X1 %0,%1 # put_user\n" \ > >>+ "1: " op "%X1 %0,%1 # put_user\n" \ > >>EX_TABLE(1b, %l2) \ > >>: \ > >>- : "r" (x), "m<>" (*addr)\ > >>+ : "r" (x), "m" (*addr) \ > >>: \ > >>: label) > > > >Like that. But you will have to do that to *all* places we use the "<>" > >constraints, or wait for more stuff to fail? And, there probably are > >places we *do* want update form insns used (they do help in some loops, > >for example)? > > > > AFAICT, git grep "m<>" provides no result. Ah, okay. > However many places have %Ux: Yes, all of those are from when "m" still meant what "m<>" is now. For seeing how many update form insns can be generated (and how much that matters), these all should be fixed, at a minimum. Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Le 06/05/2020 à 19:58, Segher Boessenkool a écrit : On Wed, May 06, 2020 at 10:58:55AM +1000, Michael Ellerman wrote: The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. [ You shouldn't use 4.6.3, there has been 4.6.4 since a while. And 4.6 is nine years old now. Most projects do not support < 4.8 anymore, on any architecture. ] Moving up to 4.6.4 wouldn't actually help with this though would it? Nope. But 4.6.4 is a bug-fix release, 91 bugs fixed since 4.6.3, so you should switch to it if you can :-) Also I have 4.6.3 compilers already built, I don't really have time to rebuild them for 4.6.4. The kernel has a top-level minimum version, which I'm not in charge of, see: https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc Yes, I know. And it is much preferred not to have stricter requirements for Power, I know that too. Something has to give though :-/ There were discussions about making 4.8 the minimum, but I'm not sure where they got to. Yeah, just petered out I think? All significant distros come with a 4.8 as system compiler. Plain "m" works, how much does the "<>" affect code gen in practice? A quick diff here shows no difference from removing "<>". It will make it impossible to use update-form instructions here. That probably does not matter much at all, in this case. If you remove the "<>" constraints, also remove the "%Un" output modifier? So like this? diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 62cc8d7640ec..ca847aed8e45 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -207,10 +207,10 @@ do { \ #define __put_user_asm_goto(x, addr, label, op) \ asm volatile goto( \ - "1:" op "%U1%X1 %0,%1# put_user\n" \ + "1:" op "%X1 %0,%1 # put_user\n" \ EX_TABLE(1b, %l2) \ : \ - : "r" (x), "m<>" (*addr) \ + : "r" (x), "m" (*addr) \ : \ : label) Like that. But you will have to do that to *all* places we use the "<>" constraints, or wait for more stuff to fail? And, there probably are places we *do* want update form insns used (they do help in some loops, for example)? AFAICT, git grep "m<>" provides no result. However many places have %Ux: arch/powerpc/boot/io.h: __asm__ __volatile__("lbz%U1%X1 %0,%1; twi 0,%0,0; isync" arch/powerpc/boot/io.h: __asm__ __volatile__("stb%U0%X0 %1,%0; sync" arch/powerpc/boot/io.h: __asm__ __volatile__("lhz%U1%X1 %0,%1; twi 0,%0,0; isync" arch/powerpc/boot/io.h: __asm__ __volatile__("sth%U0%X0 %1,%0; sync" arch/powerpc/boot/io.h: __asm__ __volatile__("lwz%U1%X1 %0,%1; twi 0,%0,0; isync" arch/powerpc/boot/io.h: __asm__ __volatile__("stw%U0%X0 %1,%0; sync" arch/powerpc/include/asm/atomic.h: __asm__ __volatile__("lwz%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter)); arch/powerpc/include/asm/atomic.h: __asm__ __volatile__("stw%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i)); arch/powerpc/include/asm/atomic.h: __asm__ __volatile__("ld%U1%X1 %0,%1" : "=r"(t) : "m"(v->counter)); arch/powerpc/include/asm/atomic.h: __asm__ __volatile__("std%U0%X0 %1,%0" : "=m"(v->counter) : "r"(i)); arch/powerpc/include/asm/book3s/32/pgtable.h: stw%U0%X0 %2,%0\n\ arch/powerpc/include/asm/book3s/32/pgtable.h: stw%U0%X0 %L2,%1" arch/powerpc/include/asm/io.h: __asm__ __volatile__("sync;"#insn"%U1%X1 %0,%1;twi 0,%0,0;isync"\ arch/powerpc/include/asm/io.h: __asm__ __volatile__("sync;"#insn"%U0%X0 %1,%0" \ arch/powerpc/include/asm/nohash/pgtable.h: stw%U0%X0 %2,%0\n\ arch/powerpc/include/asm/nohash/pgtable.h: stw%U0%X0 %L2,%1" arch/powerpc/kvm/powerpc.c: asm ("lfs%U1%X1 0,%1; stfd%U0%X0 0,%0" : "=m" (fprd) : "m" (fprs) arch/powerpc/kvm/powerpc.c: asm ("lfd%U1%X1 0,%1; stfs%U0%X0 0,%0" : "=m" (fprs) : "m" (fprd) Christophe
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Wed, May 06, 2020 at 11:36:00AM +1000, Michael Ellerman wrote: > >> As far as I understood that's mandatory on recent gcc to get the > >> pre-update form of the instruction. With older versions "m" was doing > >> the same, but not anymore. > > > > Yes. How much that matters depends on the asm. On older CPUs (6xx/7xx, > > say) the update form was just as fast as the non-update form. On newer > > or bigger CPUs it is usually executed just the same as an add followed > > by the memory access, so it just saves a bit of code size. > > The update-forms are stdux, sthux etc. right? And stdu, sthu, etc. "x" is "indexed form" (reg+reg addressing). > I don't see any change in the number of those with or without the > constraint. That's using GCC 9.3.0. It's most useful in loops (and happens more often there). You probably do not have many loops that allow update form insns. > >> Should we ifdef the "m<>" or "m" based on GCC > >> version ? > > > > That will be a lot of churn. Just make 4.8 minimum? > > As I said in my other mail that's not really up to us. We could mandate > a higher minimum for powerpc, but I'd rather not. Yeah, I quite understand that. > I think for now I'm inclined to just drop the "<>", and we can revisit > in a release or two when hopefully GCC 4.8 has become the minimum. An unhappy resolution, but it leaves a light on the horizon :-) In that case, leave the "%Un", if you will but the "<>" back soonish? Not much point removing it and putting it back later (it is harmless, there are more instances of it in the kernel as well, since many years). Thanks! Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Wed, May 06, 2020 at 10:58:55AM +1000, Michael Ellerman wrote: > >> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. > > > > [ You shouldn't use 4.6.3, there has been 4.6.4 since a while. And 4.6 > > is nine years old now. Most projects do not support < 4.8 anymore, on > > any architecture. ] > > Moving up to 4.6.4 wouldn't actually help with this though would it? Nope. But 4.6.4 is a bug-fix release, 91 bugs fixed since 4.6.3, so you should switch to it if you can :-) > Also I have 4.6.3 compilers already built, I don't really have time to > rebuild them for 4.6.4. > > The kernel has a top-level minimum version, which I'm not in charge of, see: > > https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc Yes, I know. And it is much preferred not to have stricter requirements for Power, I know that too. Something has to give though :-/ > There were discussions about making 4.8 the minimum, but I'm not sure > where they got to. Yeah, just petered out I think? All significant distros come with a 4.8 as system compiler. > >> Plain "m" works, how much does the "<>" affect code gen in practice? > >> > >> A quick diff here shows no difference from removing "<>". > > > > It will make it impossible to use update-form instructions here. That > > probably does not matter much at all, in this case. > > > > If you remove the "<>" constraints, also remove the "%Un" output modifier? > > So like this? > > diff --git a/arch/powerpc/include/asm/uaccess.h > b/arch/powerpc/include/asm/uaccess.h > index 62cc8d7640ec..ca847aed8e45 100644 > --- a/arch/powerpc/include/asm/uaccess.h > +++ b/arch/powerpc/include/asm/uaccess.h > @@ -207,10 +207,10 @@ do { > \ > > #define __put_user_asm_goto(x, addr, label, op) \ > asm volatile goto( \ > - "1: " op "%U1%X1 %0,%1 # put_user\n" \ > + "1: " op "%X1 %0,%1 # put_user\n" \ > EX_TABLE(1b, %l2) \ > : \ > - : "r" (x), "m<>" (*addr)\ > + : "r" (x), "m" (*addr) \ > : \ > : label) Like that. But you will have to do that to *all* places we use the "<>" constraints, or wait for more stuff to fail? And, there probably are places we *do* want update form insns used (they do help in some loops, for example)? Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Segher Boessenkool writes: > On Tue, May 05, 2020 at 05:40:21PM +0200, Christophe Leroy wrote: >> >>+#define __put_user_asm_goto(x, addr, label, op) \ >> >>+ asm volatile goto( \ >> >>+ "1: " op "%U1%X1 %0,%1 # put_user\n" \ >> >>+ EX_TABLE(1b, %l2) \ >> >>+ : \ >> >>+ : "r" (x), "m<>" (*addr)\ >> > >> >The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. >> > >> >Plain "m" works, how much does the "<>" affect code gen in practice? >> > >> >A quick diff here shows no difference from removing "<>". >> >> It was recommended by Segher, there has been some discussion about it on >> v1 of this patch, see >> https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/ >> >> As far as I understood that's mandatory on recent gcc to get the >> pre-update form of the instruction. With older versions "m" was doing >> the same, but not anymore. > > Yes. How much that matters depends on the asm. On older CPUs (6xx/7xx, > say) the update form was just as fast as the non-update form. On newer > or bigger CPUs it is usually executed just the same as an add followed > by the memory access, so it just saves a bit of code size. The update-forms are stdux, sthux etc. right? I don't see any change in the number of those with or without the constraint. That's using GCC 9.3.0. >> Should we ifdef the "m<>" or "m" based on GCC >> version ? > > That will be a lot of churn. Just make 4.8 minimum? As I said in my other mail that's not really up to us. We could mandate a higher minimum for powerpc, but I'd rather not. I think for now I'm inclined to just drop the "<>", and we can revisit in a release or two when hopefully GCC 4.8 has become the minimum. cheers
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Segher Boessenkool writes: > Hi! > > On Wed, May 06, 2020 at 12:27:58AM +1000, Michael Ellerman wrote: >> Christophe Leroy writes: >> > unsafe_put_user() is designed to take benefit of 'asm goto'. >> > >> > Instead of using the standard __put_user() approach and branch >> > based on the returned error, use 'asm goto' and make the >> > exception code branch directly to the error label. There is >> > no code anymore in the fixup section. >> > >> > This change significantly simplifies functions using >> > unsafe_put_user() >> > >> ... >> > >> > Signed-off-by: Christophe Leroy >> > --- >> > arch/powerpc/include/asm/uaccess.h | 61 +- >> > 1 file changed, 52 insertions(+), 9 deletions(-) >> > >> > diff --git a/arch/powerpc/include/asm/uaccess.h >> > b/arch/powerpc/include/asm/uaccess.h >> > index 9cc9c106ae2a..9365b59495a2 100644 >> > --- a/arch/powerpc/include/asm/uaccess.h >> > +++ b/arch/powerpc/include/asm/uaccess.h >> > @@ -196,6 +193,52 @@ do { >> > \ >> > }) >> > >> > >> > +#define __put_user_asm_goto(x, addr, label, op) \ >> > + asm volatile goto( \ >> > + "1: " op "%U1%X1 %0,%1 # put_user\n" \ >> > + EX_TABLE(1b, %l2) \ >> > + : \ >> > + : "r" (x), "m<>" (*addr)\ >> >> The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. > > [ You shouldn't use 4.6.3, there has been 4.6.4 since a while. And 4.6 > is nine years old now. Most projects do not support < 4.8 anymore, on > any architecture. ] Moving up to 4.6.4 wouldn't actually help with this though would it? Also I have 4.6.3 compilers already built, I don't really have time to rebuild them for 4.6.4. The kernel has a top-level minimum version, which I'm not in charge of, see: https://www.kernel.org/doc/html/latest/process/changes.html?highlight=gcc There were discussions about making 4.8 the minimum, but I'm not sure where they got to. >> Plain "m" works, how much does the "<>" affect code gen in practice? >> >> A quick diff here shows no difference from removing "<>". > > It will make it impossible to use update-form instructions here. That > probably does not matter much at all, in this case. > > If you remove the "<>" constraints, also remove the "%Un" output modifier? So like this? diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 62cc8d7640ec..ca847aed8e45 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -207,10 +207,10 @@ do { \ #define __put_user_asm_goto(x, addr, label, op)\ asm volatile goto( \ - "1: " op "%U1%X1 %0,%1 # put_user\n" \ + "1: " op "%X1 %0,%1 # put_user\n" \ EX_TABLE(1b, %l2) \ : \ - : "r" (x), "m<>" (*addr)\ + : "r" (x), "m" (*addr) \ : \ : label) cheers
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Hi, Le 05/05/2020 à 16:27, Michael Ellerman a écrit : Christophe Leroy writes: unsafe_put_user() is designed to take benefit of 'asm goto'. Instead of using the standard __put_user() approach and branch based on the returned error, use 'asm goto' and make the exception code branch directly to the error label. There is no code anymore in the fixup section. This change significantly simplifies functions using unsafe_put_user() ... Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/uaccess.h | 61 +- 1 file changed, 52 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 9cc9c106ae2a..9365b59495a2 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -196,6 +193,52 @@ do { \ }) +#define __put_user_asm_goto(x, addr, label, op) \ + asm volatile goto( \ + "1:" op "%U1%X1 %0,%1# put_user\n" \ + EX_TABLE(1b, %l2) \ + : \ + : "r" (x), "m<>" (*addr) \ The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. Plain "m" works, how much does the "<>" affect code gen in practice? A quick diff here shows no difference from removing "<>". It was recommended by Segher, there has been some discussion about it on v1 of this patch, see https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/ As far as I understood that's mandatory on recent gcc to get the pre-update form of the instruction. With older versions "m" was doing the same, but not anymore. Should we ifdef the "m<>" or "m" based on GCC version ? Christophe
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
On Tue, May 05, 2020 at 05:40:21PM +0200, Christophe Leroy wrote: > >>+#define __put_user_asm_goto(x, addr, label, op)\ > >>+ asm volatile goto( \ > >>+ "1: " op "%U1%X1 %0,%1 # put_user\n" \ > >>+ EX_TABLE(1b, %l2) \ > >>+ : \ > >>+ : "r" (x), "m<>" (*addr)\ > > > >The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. > > > >Plain "m" works, how much does the "<>" affect code gen in practice? > > > >A quick diff here shows no difference from removing "<>". > > It was recommended by Segher, there has been some discussion about it on > v1 of this patch, see > https://patchwork.ozlabs.org/project/linuxppc-dev/patch/4fdc2aba6f5e51887d1cd0fee94be0989eada2cd.1586942312.git.christophe.le...@c-s.fr/ > > As far as I understood that's mandatory on recent gcc to get the > pre-update form of the instruction. With older versions "m" was doing > the same, but not anymore. Yes. How much that matters depends on the asm. On older CPUs (6xx/7xx, say) the update form was just as fast as the non-update form. On newer or bigger CPUs it is usually executed just the same as an add followed by the memory access, so it just saves a bit of code size. > Should we ifdef the "m<>" or "m" based on GCC > version ? That will be a lot of churn. Just make 4.8 minimum? Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Hi! On Wed, May 06, 2020 at 12:27:58AM +1000, Michael Ellerman wrote: > Christophe Leroy writes: > > unsafe_put_user() is designed to take benefit of 'asm goto'. > > > > Instead of using the standard __put_user() approach and branch > > based on the returned error, use 'asm goto' and make the > > exception code branch directly to the error label. There is > > no code anymore in the fixup section. > > > > This change significantly simplifies functions using > > unsafe_put_user() > > > ... > > > > Signed-off-by: Christophe Leroy > > --- > > arch/powerpc/include/asm/uaccess.h | 61 +- > > 1 file changed, 52 insertions(+), 9 deletions(-) > > > > diff --git a/arch/powerpc/include/asm/uaccess.h > > b/arch/powerpc/include/asm/uaccess.h > > index 9cc9c106ae2a..9365b59495a2 100644 > > --- a/arch/powerpc/include/asm/uaccess.h > > +++ b/arch/powerpc/include/asm/uaccess.h > > @@ -196,6 +193,52 @@ do { > > \ > > }) > > > > > > +#define __put_user_asm_goto(x, addr, label, op)\ > > + asm volatile goto( \ > > + "1: " op "%U1%X1 %0,%1 # put_user\n" \ > > + EX_TABLE(1b, %l2) \ > > + : \ > > + : "r" (x), "m<>" (*addr)\ > > The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. [ You shouldn't use 4.6.3, there has been 4.6.4 since a while. And 4.6 is nine years old now. Most projects do not support < 4.8 anymore, on any architecture. ] > Plain "m" works, how much does the "<>" affect code gen in practice? > > A quick diff here shows no difference from removing "<>". It will make it impossible to use update-form instructions here. That probably does not matter much at all, in this case. If you remove the "<>" constraints, also remove the "%Un" output modifier? Segher
Re: [PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
Christophe Leroy writes: > unsafe_put_user() is designed to take benefit of 'asm goto'. > > Instead of using the standard __put_user() approach and branch > based on the returned error, use 'asm goto' and make the > exception code branch directly to the error label. There is > no code anymore in the fixup section. > > This change significantly simplifies functions using > unsafe_put_user() > ... > > Signed-off-by: Christophe Leroy > --- > arch/powerpc/include/asm/uaccess.h | 61 +- > 1 file changed, 52 insertions(+), 9 deletions(-) > > diff --git a/arch/powerpc/include/asm/uaccess.h > b/arch/powerpc/include/asm/uaccess.h > index 9cc9c106ae2a..9365b59495a2 100644 > --- a/arch/powerpc/include/asm/uaccess.h > +++ b/arch/powerpc/include/asm/uaccess.h > @@ -196,6 +193,52 @@ do { > \ > }) > > > +#define __put_user_asm_goto(x, addr, label, op) \ > + asm volatile goto( \ > + "1: " op "%U1%X1 %0,%1 # put_user\n" \ > + EX_TABLE(1b, %l2) \ > + : \ > + : "r" (x), "m<>" (*addr)\ The "m<>" here is breaking GCC 4.6.3, which we allegedly still support. Plain "m" works, how much does the "<>" affect code gen in practice? A quick diff here shows no difference from removing "<>". cheers
[PATCH v4 1/2] powerpc/uaccess: Implement unsafe_put_user() using 'asm goto'
unsafe_put_user() is designed to take benefit of 'asm goto'. Instead of using the standard __put_user() approach and branch based on the returned error, use 'asm goto' and make the exception code branch directly to the error label. There is no code anymore in the fixup section. This change significantly simplifies functions using unsafe_put_user() Small exemple of the benefit with the following code: struct test { u32 item1; u16 item2; u8 item3; u64 item4; }; int set_test_to_user(struct test __user *test, u32 item1, u16 item2, u8 item3, u64 item4) { unsafe_put_user(item1, &test->item1, failed); unsafe_put_user(item2, &test->item2, failed); unsafe_put_user(item3, &test->item3, failed); unsafe_put_user(item4, &test->item4, failed); return 0; failed: return -EFAULT; } Before the patch: 0be8 : be8: 39 20 00 00 li r9,0 bec: 90 83 00 00 stw r4,0(r3) bf0: 2f 89 00 00 cmpwi cr7,r9,0 bf4: 40 9e 00 38 bne cr7,c2c bf8: b0 a3 00 04 sth r5,4(r3) bfc: 2f 89 00 00 cmpwi cr7,r9,0 c00: 40 9e 00 2c bne cr7,c2c c04: 98 c3 00 06 stb r6,6(r3) c08: 2f 89 00 00 cmpwi cr7,r9,0 c0c: 40 9e 00 20 bne cr7,c2c c10: 90 e3 00 08 stw r7,8(r3) c14: 91 03 00 0c stw r8,12(r3) c18: 21 29 00 00 subfic r9,r9,0 c1c: 7d 29 49 10 subfe r9,r9,r9 c20: 38 60 ff f2 li r3,-14 c24: 7d 23 18 38 and r3,r9,r3 c28: 4e 80 00 20 blr c2c: 38 60 ff f2 li r3,-14 c30: 4e 80 00 20 blr <.fixup>: ... b8: 39 20 ff f2 li r9,-14 bc: 48 00 00 00 b bc <.fixup+0xbc> bc: R_PPC_REL24 .text+0xbf0 c0: 39 20 ff f2 li r9,-14 c4: 48 00 00 00 b c4 <.fixup+0xc4> c4: R_PPC_REL24 .text+0xbfc c8: 39 20 ff f2 li r9,-14 cc: 48 00 00 00 b cc <.fixup+0xcc> cc: R_PPC_REL24 .text+0xc08 d0: 39 20 ff f2 li r9,-14 d4: 48 00 00 00 b d4 <.fixup+0xd4> d4: R_PPC_REL24 .text+0xc18 <__ex_table>: ... a0: R_PPC_REL32 .text+0xbec a4: R_PPC_REL32 .fixup+0xb8 a8: R_PPC_REL32 .text+0xbf8 ac: R_PPC_REL32 .fixup+0xc0 b0: R_PPC_REL32 .text+0xc04 b4: R_PPC_REL32 .fixup+0xc8 b8: R_PPC_REL32 .text+0xc10 bc: R_PPC_REL32 .fixup+0xd0 c0: R_PPC_REL32 .text+0xc14 c4: R_PPC_REL32 .fixup+0xd0 After the patch: 0be8 : be8: 90 83 00 00 stw r4,0(r3) bec: b0 a3 00 04 sth r5,4(r3) bf0: 98 c3 00 06 stb r6,6(r3) bf4: 90 e3 00 08 stw r7,8(r3) bf8: 91 03 00 0c stw r8,12(r3) bfc: 38 60 00 00 li r3,0 c00: 4e 80 00 20 blr c04: 38 60 ff f2 li r3,-14 c08: 4e 80 00 20 blr <__ex_table>: ... a0: R_PPC_REL32 .text+0xbe8 a4: R_PPC_REL32 .text+0xc04 a8: R_PPC_REL32 .text+0xbec ac: R_PPC_REL32 .text+0xc04 b0: R_PPC_REL32 .text+0xbf0 b4: R_PPC_REL32 .text+0xc04 b8: R_PPC_REL32 .text+0xbf4 bc: R_PPC_REL32 .text+0xc04 c0: R_PPC_REL32 .text+0xbf8 c4: R_PPC_REL32 .text+0xc04 Signed-off-by: Christophe Leroy Reviewed-by: Segher Boessenkool --- v4: - no change v3: - Added <> modifier to __put_user_asm_goto() - Removed %U1 modifier to __put_user_asm2_goto() v2: - Grouped most __goto() macros together - Removed stuff in .fixup section, referencing the error label directly from the extable - Using more flexible addressing in asm. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/uaccess.h | 61 +- 1 file changed, 52 insertions(+), 9 deletions(-) diff --git a/arch/powerpc/include/asm/uaccess.h b/arch/powerpc/include/asm/uaccess.h index 9cc9c106ae2a..9365b59495a2 100644 --- a/arch/powerpc/include/asm/uaccess.h +++ b/arch/powerpc/include/asm/uaccess.h @@ -93,12 +93,12 @@ static inline int __access_ok(unsigned long addr, unsigned long size, #define __get_user(x, ptr) \ __get_user_nocheck((x), (ptr), sizeof(*(ptr)), true) #define __put_user(x, ptr) \ - __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), true) + __put_user_nocheck((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr))) +#define __put_user_goto(x, ptr, label) \ + __put_user_nocheck_goto((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), label) #define __get_user_allowed(x, ptr) \ __get_user_no