Hi, I am pinging this because PR 91614 has been raised to P2 now: https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg00370.html
Thanks Bernd. On 9/6/19 2:12 PM, Bernd Edlinger wrote: > On 9/5/19 5:55 PM, Richard Earnshaw (lists) wrote: >> On 03/09/2019 21:45, Bernd Edlinger wrote: >>> Hi, >>> >>> due to the introduction of unaligned_loaddi and unaligned_storedi, >>> two test cases show some regression as PR 91614 points out. >>> >>> I would like to change the test expectations if these two test cases, >>> since they seem to be bogus. >>> >>> That is the test case already failed for arm_prefer_ldrd_strd targets, >>> since not a ldrd or strd is created but a ldm/stm instead, which >>> is probably not what is intended. >>> >>> That is for arm_prefer_ldrd_strd the previously used instruction movdi: >>> >>> (insn 11 10 12 2 (set (mem/c:DI (reg:SI 112) [0 MEM <charD.4[1:15]> >>> [(voidD.71 *)&destD.5932]+0 S8 A32]) >>> (reg:DI 114)) "unaligned-memcpy-2.c":11:3 642 {*movdi_vfp} >>> (nil)) >>> >>> is split up very early in subreg1 into: >>> >>> (insn 21 10 22 2 (set (mem/c:SI (reg:SI 112) [0 MEM <charD.4[1:15]> >>> [(voidD.71 *)&destD.5932]+0 S4 A32]) >>> (reg:SI 119)) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp} >>> (nil)) >>> (insn 22 21 12 2 (set (mem/c:SI (plus:SI (reg:SI 112) >>> (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> [(voidD.71 >>> *)&destD.5932]+4 S4 A32]) >>> (reg:SI 120 [+4 ])) "unaligned-memcpy-2.c":11:3 640 >>> {*arm_movsi_vfp} >>> (nil)) >>> >>> which is finally replaced by: >>> >>> (insn 35 10 12 2 (parallel [ >>> (set (mem/c:SI (reg/f:SI 3 r3 [111]) [0 MEM <charD.4[1:15]> >>> [(voidD.71 *)&destD.5932]+0 S4 A32]) >>> (reg:SI 1 r1)) >>> (set (mem/c:SI (plus:SI (reg/f:SI 3 r3 [111]) >>> (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> >>> [(voidD.71 *)&destD.5932]+4 S4 A32]) >>> (reg:SI 2 r2)) >>> ]) "unaligned-memcpy-2.c":11:3 378 {*stm2_} >>> (expr_list:REG_DEAD (reg:SI 2 r2) >>> (expr_list:REG_DEAD (reg:SI 1 r1) >>> (nil)))) >>> >>> ... so stm instead of strd. >>> >>> Since the unalinged_storedi is an unspec, it is expanded as strd which is >>> kind of okay, but there is register pair spilled which was not there >>> previously: >>> >>> aligned_dest: >>> @ args = 0, pretend = 0, frame = 0 >>> @ frame_needed = 0, uses_anonymous_args = 0 >>> @ link register save eliminated. >>> strd r4, [sp, #-8]! >>> ldr r4, [r0] @ unaligned >>> movw r3, #:lower16:.LANCHOR0 >>> ldr r5, [r0, #4] @ unaligned >>> movt r3, #:upper16:.LANCHOR0 >>> strd r4, [r3] >>> ldr r2, [r0, #8] @ unaligned >>> str r2, [r3, #8] >>> ldrh r2, [r0, #12] @ unaligned >>> strh r2, [r3, #12] @ movhi >>> ldrb r2, [r0, #14] @ zero_extendqisi2 >>> strb r2, [r3, #14] >>> ldrd r4, [sp] >>> add sp, sp, #8 >>> bx lr >>> >>> The patch filters out ldrd/strd that sp-relative, and adds a few >>> missing scan-assembler rules, in order to make the test results more stable. >>> >>> Alternative could be to use two movsi instead of the unaligned_load/storedi, >>> but that would end up in ldm/stm which looks plain wrong to me. >>> >>> >>> Tested using various arm-linux-gnueabihf cross-compilers. >>> Is it OK for trunk? >>> >>> >>> Thanks >>> Bernd. >>> >> 2019-09-03 Bernd Edlinger <bernd.edlin...@hotmail.de> >> >> PR target/91614 >> * gcc.target/arm/unaligned-memcpy-2.c: Adjust test expectations. >> * gcc.target/arm/unaligned-memcpy-3.c: Likewise. >> >> Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c >> =================================================================== >> --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (Revision 275341) >> +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c (Arbeitskopie) >> @@ -15,9 +15,11 @@ >> loads/stores for the remainder. */ >> >> /* { dg-final { scan-assembler-times "ldmia" 0 } } */ >> -/* { dg-final { scan-assembler-times "ldrd" 0 } } */ >> +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */ >> +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */ >> /* { dg-final { scan-assembler-times "stmia" 1 { target { ! { >> arm_prefer_ldrd_strd } } } } } */ >> -/* { dg-final { scan-assembler-times "strd" 1 { target { >> arm_prefer_ldrd_strd } } } } */ >> +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 1 { target { >> arm_prefer_ldrd_strd } } } } */ >> +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */ >> /* { dg-final { scan-assembler-times "ldrh" 1 } } */ >> /* { dg-final { scan-assembler-times "strh" 1 } } */ >> /* { dg-final { scan-assembler-times "ldrb" 1 } } */ >> Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c >> =================================================================== >> --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (Revision 275341) >> +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c (Arbeitskopie) >> @@ -15,10 +15,12 @@ >> loads/stores for the remainder. */ >> >> /* { dg-final { scan-assembler-times "ldmia" 1 { target { ! { >> arm_prefer_ldrd_strd } } } } } */ >> -/* { dg-final { scan-assembler-times "ldrd" 1 { target { >> arm_prefer_ldrd_strd } } } } */ >> -/* { dg-final { scan-assembler-times "strd" 0 } } */ >> -/* { dg-final { scan-assembler-times "stm" 0 } } */ >> -/* { dg-final { scan-assembler-times "ldrh" 1 { target { ! { >> arm_prefer_ldrd_strd } } } } } */ >> +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 1 { target { >> arm_prefer_ldrd_strd } } } } */ >> +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */ >> +/* { dg-final { scan-assembler-times "stmia" 0 } } */ >> +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 0 } } */ >> +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */ >> +/* { dg-final { scan-assembler-times "ldrh" 1 } } */ >> /* { dg-final { scan-assembler-times "strh" 1 } } */ >> -/* { dg-final { scan-assembler-times "ldrb" 1 { target { ! { >> arm_prefer_ldrd_strd } } } } } */ >> +/* { dg-final { scan-assembler-times "ldrb" 1 } } */ >> /* { dg-final { scan-assembler-times "strb" 1 } } */ >> >> Ug! That's horrible! >> > > Yes, I know, this won't win a prize for it's beauty :-) > > But to catch all possible corner cases can't be easy. > >> Also, it's testing things we really shouldn't be concerned about. For >> example, the ldrh+ldrb+strh+strb is probably better done as ldr [offset-1] + >> str [offset-1] to handle the tail (ie re-copying the byte 4th from the end). >> When someone does that, this test will then needlessly fail. >> > > Okay, removed those then. And scan for ldrd _or_ ldm, > respectively strd _or stm, which both are acceptable since the > aligned side is effectively 4-byte aligned. > >> I'm not really convinced that this sort of thing can be properly tested with >> scan-assembler tests, there's just too much that might change. Perhaps we >> should just accept this and create a simple executable test that will fault >> if run on a target that doesn't support the code we generate... >> > > Yes, that would be good as well, but runtime tests can't test if a > specific optimization like ldrd/strd is _not_ used when it should be. > > > > So how about this? > > Is it OK for trunk? > > > Thanks > Bernd. >