Hi All, This patch fixes a bug in one of the ARM peephole2 optimizations. The peephole2 optimization in question was changed to use the CC-updating form for all of the instructions produced by the peephole so that the encoding will be smaller when compiling for thumb [1]. However, I don't think that is always safe.
For example, the CC register might be used by something *after* the peephole window. The current peephole will transform: (insn:TI 7 49 18 2 (set (reg:CC 24 cc) (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn} (nil)) (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (reg:SI 3 r3 [140]) (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp} (expr_list:REG_EQUIV (const_int 0 [0]) (nil))) (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (reg:SI 3 r3 [138]) (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp} (expr_list:REG_EQUIV (const_int 1 [0x1]) (nil))) (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) (nil))) (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} (nil)) (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8]) (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} (expr_list:REG_DEAD (reg:CC 24 cc) (expr_list:REG_DEAD (reg:QI 3 r3 [140]) (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) (nil))))) into the following: (insn 59 49 60 2 (parallel [ (set (reg:CC 24 cc) (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) (set (reg:SI 1 r1) (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136]) (const_int 0 [0]))) ]) repro.c:6 -1 (nil)) (insn 60 59 61 2 (parallel [ (set (reg:CC 24 cc) (compare:CC (const_int 0 [0]) (reg:SI 1 r1))) (set (reg:SI 3 r3 [140]) (minus:SI (const_int 0 [0]) (reg:SI 1 r1))) ]) repro.c:6 -1 (nil)) (insn 61 60 19 2 (parallel [ (set (reg:SI 3 r3 [140]) (plus:SI (plus:SI (reg:SI 3 r3 [140]) (reg:SI 1 r1)) (geu:SI (reg:CC 24 cc) (const_int 0 [0])))) (clobber (reg:CC 24 cc)) ]) repro.c:6 -1 (nil)) (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp} (nil)) (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc) (const_int 0 [0])) (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32]) (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp} (expr_list:REG_DEAD (reg/f:SI 2 r2 [143]) (nil))) (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc) (const_int 0 [0])) (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8]) (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn} (expr_list:REG_DEAD (reg:CC 24 cc) (expr_list:REG_DEAD (reg:QI 3 r3 [140]) (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) (nil))))) This gets compiled into the incorrect sequence: ldrb r3, [r0, #0] ldr r2, .L4 subs r1, r3, #0 rsbs r3, r1, #0 adcs r3, r3, r1 strne r3, [r2, #0] streq r3, [r2, #0] strneb r3, [r0, #0] The conditional stores are now dealing with an incorrect condition state. This patch fixes the problem by ensuring that the CC reg is dead after the peephole window for the current peephole definition and falls back on the original pre-PR46975 peephole when it is live. Unfortunately I had trouble coming up with a reproduction case against mainline. I only noticed the bug while working with some local changes that exposed it. Built and tested a full ARM GNU/Linux toolchain. No regressions in the GCC test suite. OK? gcc/ 2013-05-29 Meador Inge <mead...@codesourcery.com> * config/arm/arm.md (conditional move peephole2): Only clobber CC register when it is dead after the peephole window. [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 199414) +++ gcc/config/arm/arm.md (working copy) @@ -9978,29 +9978,48 @@ ;; Attempt to improve the sequence generated by the compare_scc splitters ;; not to use conditional execution. (define_peephole2 - [(set (reg:CC CC_REGNUM) + [(set (match_operand 0 "cc_register" "") (compare:CC (match_operand:SI 1 "register_operand" "") (match_operand:SI 2 "arm_rhs_operand" ""))) (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) - (set (match_operand:SI 0 "register_operand" "") (const_int 0))) + (set (match_operand:SI 3 "register_operand" "") (const_int 0))) (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) - (set (match_dup 0) (const_int 1))) - (match_scratch:SI 3 "r")] - "TARGET_32BIT" + (set (match_dup 3) (const_int 1))) + (match_scratch:SI 4 "r")] + "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])" [(parallel [(set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (match_dup 2))) - (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) + (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC CC_REGNUM) - (compare:CC (const_int 0) (match_dup 3))) - (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) + (compare:CC (const_int 0) (match_dup 4))) + (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))]) (parallel - [(set (match_dup 0) - (plus:SI (plus:SI (match_dup 0) (match_dup 3)) + [(set (match_dup 3) + (plus:SI (plus:SI (match_dup 3) (match_dup 4)) (geu:SI (reg:CC CC_REGNUM) (const_int 0)))) (clobber (reg:CC CC_REGNUM))])]) +(define_peephole2 + [(set (reg:CC CC_REGNUM) + (compare:CC (match_operand:SI 0 "register_operand" "") + (match_operand:SI 1 "arm_rhs_operand" ""))) + (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) + (set (match_operand:SI 2 "register_operand" "") (const_int 0))) + (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) + (set (match_dup 2) (const_int 1))) + (match_scratch:SI 3 "r")] + "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])" + [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1))) + (parallel + [(set (reg:CC CC_REGNUM) + (compare:CC (const_int 0) (match_dup 3))) + (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))]) + (set (match_dup 2) + (plus:SI (plus:SI (match_dup 2) (match_dup 3)) + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) + (define_insn "*cond_move" [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") (if_then_else:SI (match_operator 3 "equality_operator"