On 15/12/16 09:55, Richard Earnshaw (lists) wrote: > On 30/11/16 16:47, Kyrill Tkachov wrote: >> Hi all, >> >> In this awkward ICE we have a *load_multiple pattern that is being >> transformed in reload from: >> (insn 55 67 151 3 (parallel [ >> (set (reg:SI 0 r0) >> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) >> (set (reg:SI 158 [ c+4 ]) >> (mem/u/c:SI (plus:SI (reg/f:SI 147) >> (const_int 4 [0x4])) [2 c+4 S4 A32])) >> ]) arm-crash.c:25 393 {*load_multiple} >> (expr_list:REG_UNUSED (reg:SI 0 r0) >> (nil))) >> >> >> into the invalid: >> (insn 55 67 70 3 (parallel [ >> (set (reg:SI 0 r0) >> (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32])) >> (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) >> (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 >> S4 A32]) >> (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147]) >> (const_int 4 [0x4])) [2 c+4 S4 A32])) >> ]) arm-crash.c:25 393 {*load_multiple} >> (nil)) >> >> The operands of *load_multiple are not validated through constraints >> like LRA is used to, but rather through >> a match_parallel predicate which ends up calling ldm_stm_operation_p to >> validate the multiple sets. >> But this means that LRA cannot reason about the constraints properly. >> This two-regiseter load should not have used *load_multiple anyway, it >> should have used *ldm2_ from ldmstm.md >> and indeed it did until the loop2_invariant pass which copied the ldm2_ >> pattern: >> (insn 27 23 28 4 (parallel [ >> (set (reg:SI 0 r0) >> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) >> (set (reg:SI 1 r1) >> (mem/u/c:SI (plus:SI (reg/f:SI 147) >> (const_int 4 [0x4])) [2 c+4 S4 A32])) >> ]) "ldm.c":25 385 {*ldm2_} >> (nil)) >> >> into: >> (insn 55 19 67 3 (parallel [ >> (set (reg:SI 0 r0) >> (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) >> (set (reg:SI 158) >> (mem/u/c:SI (plus:SI (reg/f:SI 147) >> (const_int 4 [0x4])) [2 c+4 S4 A32])) >> ]) "ldm.c":25 404 {*load_multiple} >> (expr_list:REG_UNUSED (reg:SI 0 r0) >> (nil))) >> >> Note that it now got recognised as load_multiple because the second >> register is not a hard register but the pseudo 158. >> In any case, the solution suggested in the PR (and I agree with it) is >> to restrict *load_multiple to after reload. >> The similar pattern *load_multiple_with_writeback also has a similar >> condition and the comment above *load_multiple says that >> it's used to generate epilogues, which is done after reload anyway. For >> pre-reload load-multiples the patterns in ldmstm.md >> should do just fine. >> >> Bootstrapped and tested on arm-none-linux-gnueabihf. >> >> Ok for trunk? >> > > I don't think this is right. Firstly, these patterns look to me like > the ones used for memcpy expansion, so not recognizing them could lead > to compiler aborts. > > Secondly, the bug is when we generate > > (insn 55 67 70 3 (parallel [ > (set (reg:SI 0 r0) > (mem/u/c:SI (reg/f:SI 5 r5 [147]) [2 c+0 S4 A32])) > (set (mem/c:SI (plus:SI (reg/f:SI 102 sfp) > (const_int -4 [0xfffffffffffffffc])) [4 %sfp+-12 > S4 A32]) > (mem/u/c:SI (plus:SI (reg/f:SI 5 r5 [147]) > (const_int 4 [0x4])) [2 c+4 S4 A32])) > ]) arm-crash.c:25 393 {*load_multiple} > (nil))
sorry, pasted the wrong bit of code. That should read when we generate: (insn 55 19 67 3 (parallel [ (set (reg:SI 0 r0) (mem/u/c:SI (reg/f:SI 147) [2 c+0 S4 A32])) (set (reg:SI 158) (mem/u/c:SI (plus:SI (reg/f:SI 147) (const_int 4 [0x4])) [2 c+4 S4 A32])) ]) "ldm.c":25 404 {*load_multiple} (expr_list:REG_UNUSED (reg:SI 0 r0) (nil))) ie when we put a pseudo into the register load list. > > These patterns are supposed to enforce that the load (store) target > register is a hard register that is higher numbered than the hard > register in the memory slot that precedes it (thus satisfying the > constraints for a normal ldm/stm instruction. > > The real question is why did ldm_stm_operation_p permit this modified > pattern through, or was the pattern validation bypassed incorrectly by > the loop2 invariant code when it copied the insn and made changes? > > R. > >> Thanks, >> Kyrill >> >> 2016-11-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com> >> >> PR target/71436 >> * config/arm/arm.md (*load_multiple): Add reload_completed to >> matching condition. >> >> 2016-11-30 Kyrylo Tkachov <kyrylo.tkac...@arm.com> >> >> PR target/71436 >> * gcc.c-torture/compile/pr71436.c: New test. >> >> arm-ldm.patch >> >> >> commit 996d28e2353badd1b29ef000f94d40c7dab9010f >> Author: Kyrylo Tkachov <kyrylo.tkac...@arm.com> >> Date: Tue Nov 29 15:07:30 2016 +0000 >> >> [ARM] Restrict *load_multiple pattern till after LRA >> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md >> index 74c44f3..22d2a84 100644 >> --- a/gcc/config/arm/arm.md >> +++ b/gcc/config/arm/arm.md >> @@ -11807,12 +11807,15 @@ (define_insn "<crc_variant>" >> >> ;; Patterns in ldmstm.md don't cover more than 4 registers. This pattern >> covers >> ;; large lists without explicit writeback generated for APCS_FRAME epilogue. >> +;; The operands are validated through the load_multiple_operation >> +;; match_parallel predicate rather than through constraints so enable it >> only >> +;; after reload. >> (define_insn "*load_multiple" >> [(match_parallel 0 "load_multiple_operation" >> [(set (match_operand:SI 2 "s_register_operand" "=rk") >> (mem:SI (match_operand:SI 1 "s_register_operand" "rk"))) >> ])] >> - "TARGET_32BIT" >> + "TARGET_32BIT && reload_completed" >> "* >> { >> arm_output_multireg_pop (operands, /*return_pc=*/false, >> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr71436.c >> b/gcc/testsuite/gcc.c-torture/compile/pr71436.c >> new file mode 100644 >> index 0000000..ab08d5d >> --- /dev/null >> +++ b/gcc/testsuite/gcc.c-torture/compile/pr71436.c >> @@ -0,0 +1,35 @@ >> +/* PR target/71436. */ >> + >> +#pragma pack(1) >> +struct S0 >> +{ >> + volatile int f0; >> + short f2; >> +}; >> + >> +void foo (struct S0 *); >> +int a, d; >> +static struct S0 b[5]; >> +static struct S0 c; >> +void fn1 (); >> +void >> +main () >> +{ >> + { >> + struct S0 e; >> + for (; d; fn1 ()) >> + { >> + { >> + a = 3; >> + for (; a >= 0; a -= 1) >> + { >> + { >> + e = c; >> + } >> + b[a] = e; >> + } >> + } >> + } >> + } >> + foo (b); >> +} >> >