Hi,

I am pinging this because PR 91614 has been raised to P2 now:
https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg00370.html

Thanks
Bernd.


On 9/6/19 2:12 PM, Bernd Edlinger wrote:
> On 9/5/19 5:55 PM, Richard Earnshaw (lists) wrote:
>> On 03/09/2019 21:45, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> due to the introduction of unaligned_loaddi and unaligned_storedi,
>>> two test cases show some regression as PR 91614 points out.
>>>
>>> I would like to change the test expectations if these two test cases,
>>> since they seem to be bogus.
>>>
>>> That is the test case already failed for arm_prefer_ldrd_strd targets,
>>> since not a ldrd or strd is created but a ldm/stm instead, which
>>> is probably not what is intended.
>>>
>>> That is for arm_prefer_ldrd_strd the previously used instruction movdi:
>>>
>>> (insn 11 10 12 2 (set (mem/c:DI (reg:SI 112) [0 MEM <charD.4[1:15]> 
>>> [(voidD.71 *)&destD.5932]+0 S8 A32])
>>>          (reg:DI 114)) "unaligned-memcpy-2.c":11:3 642 {*movdi_vfp}
>>>       (nil))
>>>
>>> is split up very early in subreg1 into:
>>>
>>> (insn 21 10 22 2 (set (mem/c:SI (reg:SI 112) [0 MEM <charD.4[1:15]> 
>>> [(voidD.71 *)&destD.5932]+0 S4 A32])
>>>          (reg:SI 119)) "unaligned-memcpy-2.c":11:3 640 {*arm_movsi_vfp}
>>>       (nil))
>>> (insn 22 21 12 2 (set (mem/c:SI (plus:SI (reg:SI 112)
>>>                  (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> [(voidD.71 
>>> *)&destD.5932]+4 S4 A32])
>>>          (reg:SI 120 [+4 ])) "unaligned-memcpy-2.c":11:3 640 
>>> {*arm_movsi_vfp}
>>>       (nil))
>>>
>>> which is finally replaced by:
>>>
>>> (insn 35 10 12 2 (parallel [
>>>              (set (mem/c:SI (reg/f:SI 3 r3 [111]) [0 MEM <charD.4[1:15]> 
>>> [(voidD.71 *)&destD.5932]+0 S4 A32])
>>>                  (reg:SI 1 r1))
>>>              (set (mem/c:SI (plus:SI (reg/f:SI 3 r3 [111])
>>>                          (const_int 4 [0x4])) [0 MEM <charD.4[1:15]> 
>>> [(voidD.71 *)&destD.5932]+4 S4 A32])
>>>                  (reg:SI 2 r2))
>>>          ]) "unaligned-memcpy-2.c":11:3 378 {*stm2_}
>>>       (expr_list:REG_DEAD (reg:SI 2 r2)
>>>          (expr_list:REG_DEAD (reg:SI 1 r1)
>>>              (nil))))
>>>
>>> ... so stm instead of strd.
>>>
>>> Since the unalinged_storedi is an unspec, it is expanded as strd which is
>>> kind of okay, but there is register pair spilled which was not there 
>>> previously:
>>>
>>> aligned_dest:
>>>     @ args = 0, pretend = 0, frame = 0
>>>     @ frame_needed = 0, uses_anonymous_args = 0
>>>     @ link register save eliminated.
>>>     strd    r4, [sp, #-8]!
>>>     ldr    r4, [r0]    @ unaligned
>>>     movw    r3, #:lower16:.LANCHOR0
>>>     ldr    r5, [r0, #4]    @ unaligned
>>>     movt    r3, #:upper16:.LANCHOR0
>>>     strd    r4, [r3]
>>>     ldr    r2, [r0, #8]    @ unaligned
>>>     str    r2, [r3, #8]
>>>     ldrh    r2, [r0, #12]    @ unaligned
>>>     strh    r2, [r3, #12]    @ movhi
>>>     ldrb    r2, [r0, #14]    @ zero_extendqisi2
>>>     strb    r2, [r3, #14]
>>>     ldrd    r4, [sp]
>>>     add    sp, sp, #8
>>>     bx    lr
>>>
>>> The patch filters out ldrd/strd that sp-relative, and adds a few
>>> missing scan-assembler rules, in order to make the test results more stable.
>>>
>>> Alternative could be to use two movsi instead of the unaligned_load/storedi,
>>> but that would end up in ldm/stm which looks plain wrong to me.
>>>
>>>
>>> Tested using various arm-linux-gnueabihf cross-compilers.
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>> 2019-09-03  Bernd Edlinger  <bernd.edlin...@hotmail.de>
>>
>>     PR target/91614
>>     * gcc.target/arm/unaligned-memcpy-2.c: Adjust test expectations.
>>     * gcc.target/arm/unaligned-memcpy-3.c: Likewise.
>>
>> Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c    (Revision 275341)
>> +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-2.c    (Arbeitskopie)
>> @@ -15,9 +15,11 @@
>>     loads/stores for the remainder.  */
>>
>>  /* { dg-final { scan-assembler-times "ldmia" 0 } } */
>> -/* { dg-final { scan-assembler-times "ldrd" 0 } } */
>> +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 0 } } */
>> +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */
>>  /* { dg-final { scan-assembler-times "stmia" 1 { target { ! { 
>> arm_prefer_ldrd_strd } } } } } */
>> -/* { dg-final { scan-assembler-times "strd" 1 { target { 
>> arm_prefer_ldrd_strd } } } } */
>> +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 1 { target { 
>> arm_prefer_ldrd_strd } } } } */
>> +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */
>>  /* { dg-final { scan-assembler-times "ldrh" 1 } } */
>>  /* { dg-final { scan-assembler-times "strh" 1 } } */
>>  /* { dg-final { scan-assembler-times "ldrb" 1 } } */
>> Index: gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c
>> ===================================================================
>> --- gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c    (Revision 275341)
>> +++ gcc/testsuite/gcc.target/arm/unaligned-memcpy-3.c    (Arbeitskopie)
>> @@ -15,10 +15,12 @@
>>     loads/stores for the remainder.  */
>>
>>  /* { dg-final { scan-assembler-times "ldmia" 1 { target { ! { 
>> arm_prefer_ldrd_strd } } } } } */
>> -/* { dg-final { scan-assembler-times "ldrd" 1 { target { 
>> arm_prefer_ldrd_strd } } } } */
>> -/* { dg-final { scan-assembler-times "strd" 0 } } */
>> -/* { dg-final { scan-assembler-times "stm" 0 } } */
>> -/* { dg-final { scan-assembler-times "ldrh" 1 { target { ! { 
>> arm_prefer_ldrd_strd } } } } } */
>> +/* { dg-final { scan-assembler-times "ldrd\(?!\[^\\n\]*sp\)" 1 { target { 
>> arm_prefer_ldrd_strd } } } } */
>> +/* { dg-final { scan-assembler-times "ldm\(?!ia\)" 0 } } */
>> +/* { dg-final { scan-assembler-times "stmia" 0 } } */
>> +/* { dg-final { scan-assembler-times "strd\(?!\[^\\n\]*sp\)" 0 } } */
>> +/* { dg-final { scan-assembler-times "stm\(?!ia\)" 0 } } */
>> +/* { dg-final { scan-assembler-times "ldrh" 1 } } */
>>  /* { dg-final { scan-assembler-times "strh" 1 } } */
>> -/* { dg-final { scan-assembler-times "ldrb" 1 { target { ! { 
>> arm_prefer_ldrd_strd } } } } } */
>> +/* { dg-final { scan-assembler-times "ldrb" 1 } } */
>>  /* { dg-final { scan-assembler-times "strb" 1 } } */
>>
>> Ug!  That's horrible!
>>
> 
> Yes, I know, this won't win a prize for it's beauty :-)
> 
> But to catch all possible corner cases can't be easy.
> 
>> Also, it's testing things we really shouldn't be concerned about.  For 
>> example, the ldrh+ldrb+strh+strb is probably better done as ldr [offset-1] + 
>> str [offset-1] to handle the tail (ie re-copying the byte 4th from the end). 
>>  When someone does that, this test will then needlessly fail.
>>
> 
> Okay, removed those then.  And scan for ldrd _or_ ldm,
> respectively strd _or stm, which both are acceptable since the
> aligned side is effectively 4-byte aligned.
> 
>> I'm not really convinced that this sort of thing can be properly tested with 
>> scan-assembler tests, there's just too much that might change. Perhaps we 
>> should just accept this and create a simple executable test that will fault 
>> if run on a target that doesn't support the code we generate...
>>
> 
> Yes, that would be good as well, but runtime tests can't test if a
> specific optimization like ldrd/strd is _not_ used when it should be.
> 
> 
> 
> So how about this?
> 
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 

Reply via email to