Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 06/09/2019 11:15, Bernd Edlinger wrote: On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */ I don't think this test is right. While we can't use an LDRD to load the argument off the stack, there's nothing wrong with using an STRD to then store the value to f0 (as that is 8-byte aligned). So the second and third scan-assembler tests are meaningless. R. (sorry, just noticed this). So, agreed, that is really likely to change. I would just remove those, as attached. Is that OK for trunk? Thanks Bernd. OK. R.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: > Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c > === > --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) > +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) > @@ -0,0 +1,19 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_arm_ok } */ > +/* { dg-require-effective-target arm_ldrd_strd_ok } */ > +/* { dg-options "-marm -mno-unaligned-access -O3" } */ > + > +struct s { > + int a, b; > +} __attribute__((aligned(8))); > + > +struct s f0; > + > +void f(int a, int b, int c, int d, int e, struct s f) > +{ > + f0 = f; > +} > + > +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ > +/* { dg-final { scan-assembler-times "strd" 0 } } */ > +/* { dg-final { scan-assembler-times "stm" 1 } } */ > > I don't think this test is right. While we can't use an LDRD to load the > argument off the stack, there's nothing wrong with using an STRD to then > store the value to f0 (as that is 8-byte aligned). So the second and third > scan-assembler tests are meaningless. > > R. > > (sorry, just noticed this). So, agreed, that is really likely to change. I would just remove those, as attached. Is that OK for trunk? Thanks Bernd. 2019-09-06 Bernd Edlinger * gcc.target/arm/unaligned-argument-2.c: Remove bogus test cases. Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (revision 275409) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (working copy) @@ -15,5 +15,3 @@ void f(int a, int b, int c, int d, int e, struct s } /* { dg-final { scan-assembler-times "ldrd" 0 } } */ -/* { dg-final { scan-assembler-times "strd" 0 } } */ -/* { dg-final { scan-assembler-times "stm" 1 } } */
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 9/5/19 11:21 AM, Richard Earnshaw (lists) wrote: > On 04/09/2019 16:48, Richard Earnshaw (lists) wrote: >> On 04/09/2019 16:00, Bernd Edlinger wrote: >>> On 9/4/19 4:14 PM, Richard Earnshaw (lists) wrote: On 04/09/2019 14:28, Bernd Edlinger wrote: > On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: >> Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c >> === >> --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) >> +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) >> @@ -0,0 +1,19 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target arm_arm_ok } */ >> +/* { dg-require-effective-target arm_ldrd_strd_ok } */ >> +/* { dg-options "-marm -mno-unaligned-access -O3" } */ >> + >> +struct s { >> + int a, b; >> +} __attribute__((aligned(8))); >> + >> +struct s f0; >> + >> +void f(int a, int b, int c, int d, int e, struct s f) >> +{ >> + f0 = f; >> +} >> + >> +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ >> +/* { dg-final { scan-assembler-times "strd" 0 } } */ >> +/* { dg-final { scan-assembler-times "stm" 1 } } */ >> >> I don't think this test is right. While we can't use an LDRD to load >> the argument off the stack, there's nothing wrong with using an STRD to >> then store the value to f0 (as that is 8-byte aligned). So the second >> and third scan-assembler tests are meaningless. >> > > Ah, that is very similar to the unaligned-memcpy-2/3.c, > see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html > > initially that is a movdi, > then in subreg1 it is split in two movsi > which is then re-assembled as ldm > > > Not sure if that is intended in that way. > > Yeah, these are causing me some problems too, but that's because with some changes I'm working on I now see the compiler using r4 and r5, which leads to prologue and epilogue stores that distort the results. Tests like this are generally fragile - I hate 'em >>> >>> Yeah, that changed since r275063 introduced the unaligned-load/storedi >>> >>> r275063 | edlinger | 2019-08-30 12:38:37 +0200 (Fr, 30. Aug 2019) | 10 >>> Zeilen >>> Geänderte Pfade: >>> M /trunk/gcc/ChangeLog >>> M /trunk/gcc/config/arm/arm.c >>> M /trunk/gcc/config/arm/arm.md >>> M /trunk/gcc/config/arm/neon.md >>> >>> 2019-08-30 Bernd Edlinger >>> >>> * config/arm/arm.md (unaligned_loaddi, >>> unaligned_storedi): New unspec insn patterns. >>> * config/arm/neon.md (unaligned_storev8qi): Likewise. >>> * config/arm/arm.c (gen_cpymem_ldrd_strd): Use unaligned_loaddi >>> and unaligned_storedi for 4-byte aligned memory. >>> (arm_block_set_aligned_vect): Use unaligned_storev8qi for >>> 4-byte aligned memory. >>> >>> Since other than the movdi they are not split up but stay as ldrd/strd. >>> But for some unknown reason ira assigns r4-5 to those although also >>> r1-2 would be available. :-( >>> >> >> r1-r2 can't be used in Arm state as the register has to start on an even >> boundary. But ira has already used r3 for the address of the store (it >> could have picked r1) and now r4-r5 is the next even-numbered pair. So we >> end up with needing to save some call-clobbered regs. >> >> R. >>> >>> Bernd. >>> >> > > One possible trick to stabilize the test is to insert an asm that clobbers r4 > and r5 and forces the prologue/epilogue code to always save and restore them. > Then we can account for those prologue/epilogue consistently (at least, > modulo the arm_prefer_ldrd_strd condition). > Yes, or add -fno-omit-frame-pointer. BTW: have you seen this negative lookahead in my patch [PATCH] [ARM] Adjust test expectations of unaligned-memcpy-2/3.c (PR 91614) https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html /* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */ it makes the test work for all possible combinations with RUNTESTFLAGS="--target_board=unix\{-mcpu=cortex-a15,-mcpu=cortex-a57,-mcpu=cortex-a9,-mcpu=cortex-a8,-mcpu=cortex-a7\}\{-fno-omit-frame-pointer,\}" Cool isn't it? Bernd.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 04/09/2019 16:48, Richard Earnshaw (lists) wrote: On 04/09/2019 16:00, Bernd Edlinger wrote: On 9/4/19 4:14 PM, Richard Earnshaw (lists) wrote: On 04/09/2019 14:28, Bernd Edlinger wrote: On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */ I don't think this test is right. While we can't use an LDRD to load the argument off the stack, there's nothing wrong with using an STRD to then store the value to f0 (as that is 8-byte aligned). So the second and third scan-assembler tests are meaningless. Ah, that is very similar to the unaligned-memcpy-2/3.c, see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html initially that is a movdi, then in subreg1 it is split in two movsi which is then re-assembled as ldm Not sure if that is intended in that way. Yeah, these are causing me some problems too, but that's because with some changes I'm working on I now see the compiler using r4 and r5, which leads to prologue and epilogue stores that distort the results. Tests like this are generally fragile - I hate 'em Yeah, that changed since r275063 introduced the unaligned-load/storedi r275063 | edlinger | 2019-08-30 12:38:37 +0200 (Fr, 30. Aug 2019) | 10 Zeilen Geänderte Pfade: M /trunk/gcc/ChangeLog M /trunk/gcc/config/arm/arm.c M /trunk/gcc/config/arm/arm.md M /trunk/gcc/config/arm/neon.md 2019-08-30 Bernd Edlinger * config/arm/arm.md (unaligned_loaddi, unaligned_storedi): New unspec insn patterns. * config/arm/neon.md (unaligned_storev8qi): Likewise. * config/arm/arm.c (gen_cpymem_ldrd_strd): Use unaligned_loaddi and unaligned_storedi for 4-byte aligned memory. (arm_block_set_aligned_vect): Use unaligned_storev8qi for 4-byte aligned memory. Since other than the movdi they are not split up but stay as ldrd/strd. But for some unknown reason ira assigns r4-5 to those although also r1-2 would be available. :-( r1-r2 can't be used in Arm state as the register has to start on an even boundary. But ira has already used r3 for the address of the store (it could have picked r1) and now r4-r5 is the next even-numbered pair. So we end up with needing to save some call-clobbered regs. R. Bernd. One possible trick to stabilize the test is to insert an asm that clobbers r4 and r5 and forces the prologue/epilogue code to always save and restore them. Then we can account for those prologue/epilogue consistently (at least, modulo the arm_prefer_ldrd_strd condition). R.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 04/09/2019 16:00, Bernd Edlinger wrote: On 9/4/19 4:14 PM, Richard Earnshaw (lists) wrote: On 04/09/2019 14:28, Bernd Edlinger wrote: On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */ I don't think this test is right. While we can't use an LDRD to load the argument off the stack, there's nothing wrong with using an STRD to then store the value to f0 (as that is 8-byte aligned). So the second and third scan-assembler tests are meaningless. Ah, that is very similar to the unaligned-memcpy-2/3.c, see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html initially that is a movdi, then in subreg1 it is split in two movsi which is then re-assembled as ldm Not sure if that is intended in that way. Yeah, these are causing me some problems too, but that's because with some changes I'm working on I now see the compiler using r4 and r5, which leads to prologue and epilogue stores that distort the results. Tests like this are generally fragile - I hate 'em Yeah, that changed since r275063 introduced the unaligned-load/storedi r275063 | edlinger | 2019-08-30 12:38:37 +0200 (Fr, 30. Aug 2019) | 10 Zeilen Geänderte Pfade: M /trunk/gcc/ChangeLog M /trunk/gcc/config/arm/arm.c M /trunk/gcc/config/arm/arm.md M /trunk/gcc/config/arm/neon.md 2019-08-30 Bernd Edlinger * config/arm/arm.md (unaligned_loaddi, unaligned_storedi): New unspec insn patterns. * config/arm/neon.md (unaligned_storev8qi): Likewise. * config/arm/arm.c (gen_cpymem_ldrd_strd): Use unaligned_loaddi and unaligned_storedi for 4-byte aligned memory. (arm_block_set_aligned_vect): Use unaligned_storev8qi for 4-byte aligned memory. Since other than the movdi they are not split up but stay as ldrd/strd. But for some unknown reason ira assigns r4-5 to those although also r1-2 would be available. :-( r1-r2 can't be used in Arm state as the register has to start on an even boundary. But ira has already used r3 for the address of the store (it could have picked r1) and now r4-r5 is the next even-numbered pair. So we end up with needing to save some call-clobbered regs. R. Bernd.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 9/4/19 4:14 PM, Richard Earnshaw (lists) wrote: > On 04/09/2019 14:28, Bernd Edlinger wrote: >> On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: >>> Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c >>> === >>> --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) >>> +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) >>> @@ -0,0 +1,19 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-require-effective-target arm_arm_ok } */ >>> +/* { dg-require-effective-target arm_ldrd_strd_ok } */ >>> +/* { dg-options "-marm -mno-unaligned-access -O3" } */ >>> + >>> +struct s { >>> + int a, b; >>> +} __attribute__((aligned(8))); >>> + >>> +struct s f0; >>> + >>> +void f(int a, int b, int c, int d, int e, struct s f) >>> +{ >>> + f0 = f; >>> +} >>> + >>> +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ >>> +/* { dg-final { scan-assembler-times "strd" 0 } } */ >>> +/* { dg-final { scan-assembler-times "stm" 1 } } */ >>> >>> I don't think this test is right. While we can't use an LDRD to load the >>> argument off the stack, there's nothing wrong with using an STRD to then >>> store the value to f0 (as that is 8-byte aligned). So the second and third >>> scan-assembler tests are meaningless. >>> >> >> Ah, that is very similar to the unaligned-memcpy-2/3.c, >> see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html >> >> initially that is a movdi, >> then in subreg1 it is split in two movsi >> which is then re-assembled as ldm >> >> >> Not sure if that is intended in that way. >> >> > > Yeah, these are causing me some problems too, but that's because with some > changes I'm working on I now see the compiler using r4 and r5, which leads to > prologue and epilogue stores that distort the results. > > Tests like this are generally fragile - I hate 'em > Yeah, that changed since r275063 introduced the unaligned-load/storedi r275063 | edlinger | 2019-08-30 12:38:37 +0200 (Fr, 30. Aug 2019) | 10 Zeilen Geänderte Pfade: M /trunk/gcc/ChangeLog M /trunk/gcc/config/arm/arm.c M /trunk/gcc/config/arm/arm.md M /trunk/gcc/config/arm/neon.md 2019-08-30 Bernd Edlinger * config/arm/arm.md (unaligned_loaddi, unaligned_storedi): New unspec insn patterns. * config/arm/neon.md (unaligned_storev8qi): Likewise. * config/arm/arm.c (gen_cpymem_ldrd_strd): Use unaligned_loaddi and unaligned_storedi for 4-byte aligned memory. (arm_block_set_aligned_vect): Use unaligned_storev8qi for 4-byte aligned memory. Since other than the movdi they are not split up but stay as ldrd/strd. But for some unknown reason ira assigns r4-5 to those although also r1-2 would be available. :-( Bernd.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 04/09/2019 14:28, Bernd Edlinger wrote: On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: On 15/08/2019 20:47, Bernd Edlinger wrote: On 8/15/19 6:29 PM, Richard Biener wrote: Please split it into the parts for the PR and parts making the asserts not trigger. Yes, will do. Okay, here is the rest of the PR 89544 fix, actually just an optimization, making the larger stack alignment known to the middle-end, and the test cases. Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. Is it OK for trunk? Thanks Bernd. Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */ I don't think this test is right. While we can't use an LDRD to load the argument off the stack, there's nothing wrong with using an STRD to then store the value to f0 (as that is 8-byte aligned). So the second and third scan-assembler tests are meaningless. Ah, that is very similar to the unaligned-memcpy-2/3.c, see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html initially that is a movdi, then in subreg1 it is split in two movsi which is then re-assembled as ldm Not sure if that is intended in that way. Yeah, these are causing me some problems too, but that's because with some changes I'm working on I now see the compiler using r4 and r5, which leads to prologue and epilogue stores that distort the results. Tests like this are generally fragile - I hate 'em R. Bernd.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 9/4/19 2:53 PM, Richard Earnshaw (lists) wrote: > On 15/08/2019 20:47, Bernd Edlinger wrote: >> On 8/15/19 6:29 PM, Richard Biener wrote: > > Please split it into the parts for the PR and parts making the > asserts not trigger. > Yes, will do. >> >> Okay, here is the rest of the PR 89544 fix, >> actually just an optimization, making the larger stack alignment >> known to the middle-end, and the test cases. >> >> >> Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> > > Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c > === > --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) > +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) > @@ -0,0 +1,19 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target arm_arm_ok } */ > +/* { dg-require-effective-target arm_ldrd_strd_ok } */ > +/* { dg-options "-marm -mno-unaligned-access -O3" } */ > + > +struct s { > + int a, b; > +} __attribute__((aligned(8))); > + > +struct s f0; > + > +void f(int a, int b, int c, int d, int e, struct s f) > +{ > + f0 = f; > +} > + > +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ > +/* { dg-final { scan-assembler-times "strd" 0 } } */ > +/* { dg-final { scan-assembler-times "stm" 1 } } */ > > I don't think this test is right. While we can't use an LDRD to load the > argument off the stack, there's nothing wrong with using an STRD to then > store the value to f0 (as that is 8-byte aligned). So the second and third > scan-assembler tests are meaningless. > Ah, that is very similar to the unaligned-memcpy-2/3.c, see https://gcc.gnu.org/ml/gcc-patches/2019-09/msg00157.html initially that is a movdi, then in subreg1 it is split in two movsi which is then re-assembled as ldm Not sure if that is intended in that way. Bernd.
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 15/08/2019 20:47, Bernd Edlinger wrote: On 8/15/19 6:29 PM, Richard Biener wrote: Please split it into the parts for the PR and parts making the asserts not trigger. Yes, will do. Okay, here is the rest of the PR 89544 fix, actually just an optimization, making the larger stack alignment known to the middle-end, and the test cases. Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. Is it OK for trunk? Thanks Bernd. Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */ I don't think this test is right. While we can't use an LDRD to load the argument off the stack, there's nothing wrong with using an STRD to then store the value to f0 (as that is 8-byte aligned). So the second and third scan-assembler tests are meaningless. R. (sorry, just noticed this).
Re: Fwd: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
Ah, yes that was unexpected... Sorry for the breakage. So this needs to be known_eq (STACK_POINTER_OFFSET, 0) instead of STACK_POINTER_OFFSET == 0 obviously. Should be fixed by this patch, which I am going to commit as "obvious" in a moment unless someone objects. Thanks Bernd. On 8/20/19 4:39 PM, John David Anglin wrote: > On 2019-08-15 3:47 p.m., Bernd Edlinger wrote: >> 2019-08-15 Bernd Edlinger >> >> PR middle-end/89544 >> * function.c (assign_parm_find_stack_rtl): Use larger alignment >> when possible. > This patch breaks build on hppa-unknown-linux-gnu: > https://buildd.debian.org/status/fetch.php?pkg=gcc-snapshot=hppa=1%3A20190820-1=1566307455=0 > > hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC > -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall > -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format > -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long > -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. > -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include > -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber > -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber > -I../../src/gcc/../libbacktrace -o function.o -MT function.o -MMD -MP -MF > ./.deps/function.TPo ../../src/gcc/function.c > hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC > -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall > -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format > -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long > -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. > -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include > -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber > -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber > -I../../src/gcc/../libbacktrace -o function-tests.o -MT function-tests.o > -MMD -MP -MF ./.deps/function-tests.TPo ../../src/gcc/function-tests.c > hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC > -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall > -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format > -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long > -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. > -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include > -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber > -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber > -I../../src/gcc/../libbacktrace -o fwprop.o -MT fwprop.o -MMD -MP -MF > ./.deps/fwprop.TPo ../../src/gcc/fwprop.c > ../../src/gcc/function.c: In function 'void assign_parm_find_stack_rtl(tree, > assign_parm_data_one*)': > ../../src/gcc/function.c:2690:28: error: no match for 'operator==' (operand > types are 'poly_int<1, long long int>' and 'int') > 2690 |&& STACK_POINTER_OFFSET == 0) > |^~ ~ > | | > | int > In file included from ../../src/gcc/coretypes.h:415, > from ../../src/gcc/function.c:36: > ../../src/gcc/wide-int.h:3287:19: note: candidate: 'template T2> typename wi::binary_traits::predicate_result operator==(const > T1&, const T2&)' > 3287 | BINARY_PREDICATE (operator ==, eq_p) > | ^~~~ > ../../src/gcc/wide-int.h:3264:3: note: in definition of macro > 'BINARY_PREDICATE' > 3264 | OP (const T1 , const T2 ) \ > | ^~ > ../../src/gcc/wide-int.h:3287:19: note: template argument > deduction/substitution failed: > 3287 | BINARY_PREDICATE (operator ==, eq_p) > | ^~~~ > ../../src/gcc/wide-int.h:3264:3: note: in definition of macro > 'BINARY_PREDICATE' > 3264 | OP (const T1 , const T2 ) \ > | ^~ > ../../src/gcc/wide-int.h: In substitution of 'template > typename wi::binary_traits::predicate_result operator==(const T1&, > const T2&) [with T1 = poly_int<1, long long int>; T2 = int]': > ../../src/gcc/function.c:2690:31: required from here > ../../src/gcc/wide-int.h:3287:19: error: incomplete type > 'wi::int_traits >' used in nested name specifier > 3287 | BINARY_PREDICATE (operator ==, eq_p) > | ^~~~ > ../../src/gcc/wide-int.h:3264:3: note: in definition of macro > 'BINARY_PREDICATE' > 3264 | OP (const T1 , const T2 ) \ > | ^~ > make[5]: *** [Makefile:1118: function.o] Error 1 > make[5]: *** Waiting for unfinished jobs > > We have the following define for STACK_POINTER_OFFSET: > > #define STACK_POINTER_OFFSET \ > (TARGET_64BIT ? -(crtl->outgoing_args_size + 48) : poly_int64 (-32)) > > Dave > 2019-08-20 Bernd Edlinger * function.c (assign_parm_find_stack_rtl): Use known_eq instead of ==. Index: gcc/function.c === ---
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 2019-08-15 3:47 p.m., Bernd Edlinger wrote: > 2019-08-15 Bernd Edlinger > > PR middle-end/89544 > * function.c (assign_parm_find_stack_rtl): Use larger alignment > when possible. This patch breaks build on hppa-unknown-linux-gnu: https://buildd.debian.org/status/fetch.php?pkg=gcc-snapshot=hppa=1%3A20190820-1=1566307455=0 hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber -I../../src/gcc/../libbacktrace -o function.o -MT function.o -MMD -MP -MF ./.deps/function.TPo ../../src/gcc/function.c hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber -I../../src/gcc/../libbacktrace -o function-tests.o -MT function-tests.o -MMD -MP -MF ./.deps/function-tests.TPo ../../src/gcc/function-tests.c hppa-linux-gnu-g++-9 -std=gnu++98 -fno-PIE -c -g -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -I. -I. -I../../src/gcc -I../../src/gcc/. -I../../src/gcc/../include -I../../src/gcc/../libcpp/include -I../../src/gcc/../libdecnumber -I../../src/gcc/../libdecnumber/dpd -I../libdecnumber -I../../src/gcc/../libbacktrace -o fwprop.o -MT fwprop.o -MMD -MP -MF ./.deps/fwprop.TPo ../../src/gcc/fwprop.c ../../src/gcc/function.c: In function 'void assign_parm_find_stack_rtl(tree, assign_parm_data_one*)': ../../src/gcc/function.c:2690:28: error: no match for 'operator==' (operand types are 'poly_int<1, long long int>' and 'int') 2690 |&& STACK_POINTER_OFFSET == 0) |^~ ~ | | | int In file included from ../../src/gcc/coretypes.h:415, from ../../src/gcc/function.c:36: ../../src/gcc/wide-int.h:3287:19: note: candidate: 'template typename wi::binary_traits::predicate_result operator==(const T1&, const T2&)' 3287 | BINARY_PREDICATE (operator ==, eq_p) | ^~~~ ../../src/gcc/wide-int.h:3264:3: note: in definition of macro 'BINARY_PREDICATE' 3264 | OP (const T1 , const T2 ) \ | ^~ ../../src/gcc/wide-int.h:3287:19: note: template argument deduction/substitution failed: 3287 | BINARY_PREDICATE (operator ==, eq_p) | ^~~~ ../../src/gcc/wide-int.h:3264:3: note: in definition of macro 'BINARY_PREDICATE' 3264 | OP (const T1 , const T2 ) \ | ^~ ../../src/gcc/wide-int.h: In substitution of 'template typename wi::binary_traits::predicate_result operator==(const T1&, const T2&) [with T1 = poly_int<1, long long int>; T2 = int]': ../../src/gcc/function.c:2690:31: required from here ../../src/gcc/wide-int.h:3287:19: error: incomplete type 'wi::int_traits >' used in nested name specifier 3287 | BINARY_PREDICATE (operator ==, eq_p) | ^~~~ ../../src/gcc/wide-int.h:3264:3: note: in definition of macro 'BINARY_PREDICATE' 3264 | OP (const T1 , const T2 ) \ | ^~ make[5]: *** [Makefile:1118: function.o] Error 1 make[5]: *** Waiting for unfinished jobs We have the following define for STACK_POINTER_OFFSET: #define STACK_POINTER_OFFSET \ (TARGET_64BIT ? -(crtl->outgoing_args_size + 48) : poly_int64 (-32)) Dave -- John David Anglin dave.ang...@bell.net
Re: [PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 8/15/19 1:47 PM, Bernd Edlinger wrote: > On 8/15/19 6:29 PM, Richard Biener wrote: Please split it into the parts for the PR and parts making the asserts not trigger. >>> Yes, will do. >>> > Okay, here is the rest of the PR 89544 fix, > actually just an optimization, making the larger stack alignment > known to the middle-end, and the test cases. > > > Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. > Is it OK for trunk? > > > Thanks > Bernd. > > > patch-arm-align-abi.diff > > 2019-08-15 Bernd Edlinger > > PR middle-end/89544 > * function.c (assign_parm_find_stack_rtl): Use larger alignment > when possible. > > testsuite: > 2019-08-15 Bernd Edlinger > > PR middle-end/89544 > * gcc.target/arm/unaligned-argument-1.c: New test. > * gcc.target/arm/unaligned-argument-2.c: New test. OK. Given the sensitivity of this code, let's give the tester a chance to run with this patch applied before we add the next one for sanitizing the middle end interface. jeff
[PATCHv5] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544)
On 8/15/19 6:29 PM, Richard Biener wrote: >>> >>> Please split it into the parts for the PR and parts making the >>> asserts not trigger. >>> >> >> Yes, will do. >> Okay, here is the rest of the PR 89544 fix, actually just an optimization, making the larger stack alignment known to the middle-end, and the test cases. Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. Is it OK for trunk? Thanks Bernd. 2019-08-15 Bernd Edlinger PR middle-end/89544 * function.c (assign_parm_find_stack_rtl): Use larger alignment when possible. testsuite: 2019-08-15 Bernd Edlinger PR middle-end/89544 * gcc.target/arm/unaligned-argument-1.c: New test. * gcc.target/arm/unaligned-argument-2.c: New test. Index: gcc/function.c === --- gcc/function.c (Revision 274531) +++ gcc/function.c (Arbeitskopie) @@ -2697,8 +2697,23 @@ assign_parm_find_stack_rtl (tree parm, struct assi intentionally forcing upward padding. Otherwise we have to come up with a guess at the alignment based on OFFSET_RTX. */ poly_int64 offset; - if (data->locate.where_pad != PAD_DOWNWARD || data->entry_parm) + if (data->locate.where_pad == PAD_NONE || data->entry_parm) align = boundary; + else if (data->locate.where_pad == PAD_UPWARD) +{ + align = boundary; + /* If the argument offset is actually more aligned than the nominal + stack slot boundary, take advantage of that excess alignment. + Don't make any assumptions if STACK_POINTER_OFFSET is in use. */ + if (poly_int_rtx_p (offset_rtx, ) + && STACK_POINTER_OFFSET == 0) + { + unsigned int offset_align = known_alignment (offset) * BITS_PER_UNIT; + if (offset_align == 0 || offset_align > STACK_BOUNDARY) + offset_align = STACK_BOUNDARY; + align = MAX (align, offset_align); + } +} else if (poly_int_rtx_p (offset_rtx, )) { align = least_bit_hwi (boundary); Index: gcc/testsuite/gcc.target/arm/unaligned-argument-1.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-1.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-1.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 1 } } */ +/* { dg-final { scan-assembler-times "strd" 1 } } */ +/* { dg-final { scan-assembler-times "stm" 0 } } */ Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c === --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (Arbeitskopie) @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_arm_ok } */ +/* { dg-require-effective-target arm_ldrd_strd_ok } */ +/* { dg-options "-marm -mno-unaligned-access -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */