We add an optimization phase in GCC, and our changes trigger GCC generates incorrect code during the combine phase. GCC incorrectly combine three instructions into two intructions:
from: insn_1 set (SI reg_a) (plus:SI (SI: reg_b) (const_int -1)) insn_2 set (SI reg_c) (ashiftrt:SI (SI: reg_a) (const_int 31)) insn_3 set (SI reg_d) (ior:SI (SI: reg_c) (const_int C)) into insn_1 set (SI reg_a) (plus:SI (SI: reg_b) (const_int -1)) insn_2 set (SI reg_c) (ashiftrt:SI (SI: reg_a) (const_int 31)) where the third instruction has been deleted incorrectly. The buggy code is in simplify_logical (rtx x) in combine.c: .... /* If OP0 is (ashiftrt (plus ...) C), it might actually be a (sign_extend (plus ...)). If so, OP1 is a CONST_INT, and the PLUS does not affect any of the bits in OP1, it can really be done as a PLUS and we can associate. We do this by seeing if OP1 can be safely shifted left C bits. */ if (GET_CODE (op1) == CONST_INT && GET_CODE (op0) == ASHIFTRT && GET_CODE (XEXP (op0, 0)) == PLUS && GET_CODE (XEXP (XEXP (op0, 0), 1)) == CONST_INT && GET_CODE (XEXP (op0, 1)) == CONST_INT && INTVAL (XEXP (op0, 1)) < HOST_BITS_PER_WIDE_INT) { int count = INTVAL (XEXP (op0, 1)); HOST_WIDE_INT mask = INTVAL (op1) << count; if (mask >> count == INTVAL (op1) && (mask & nonzero_bits (XEXP (op0, 0), mode)) == 0) { SUBST (XEXP (XEXP (op0, 0), 1), GEN_INT (INTVAL (XEXP (XEXP (op0, 0), 1)) | mask)); return op0; } } ... Where x = (ior:SI (ashiftrt:SI (plus:SI (reg:SI 127) (const_int -1 [0xffffffffffffffff])) (const_int 31 [0x1f])) (const_int 60 [0x3c])) After the simplification: x = op0 = (ashiftrt:SI (plus:SI (reg:SI 127) (const_int -1 [0xffffffffffffffff])) (const_int 31 [0x1f])) We think the buggy code is this line: ... (mask & nonzero_bits (XEXP (op0, 0), mode)) == 0 ^^^^ where nonzero_bits returns 0xffffffff. We think may be what wanted is: ... (INTVAL (op1) & nonzero_bits (XEXP (op0, 0), mode)) == 0 ^^^^^^^^^^^^ With this patch, the optimization for this case is undoable and should just be disabled. The following is the source code and rtl dumps before combine (test.c.26.life1) and after combine phase (test.c.27.combine). The option we uses: gcc -O3 -fno-inline (+with our optimization changes) The instructions to be combined are in foo() in test.c.26.life1: ... (insn 51 50 52 0 (set (reg:SI 128) (plus:SI (reg:SI 127) (const_int -1 [0xffffffffffffffff]))) 39 {*addsi3_internal1} (insn_l ist:REG_DEP_TRUE 50 (nil)) (expr_list:REG_DEAD (reg:SI 127) (nil))) (insn 52 51 54 0 (set (reg:SI 129) (ashiftrt:SI (reg:SI 128) (const_int 31 [0x1f]))) 134 {ashrsi3_no_power} (insn_list:REG_DEP_TR UE 51 (nil)) (expr_list:REG_DEAD (reg:SI 128) (nil))) (insn 54 52 26 0 (set (reg/v:SI 119 [ b ]) (ior:SI (reg:SI 129) (const_int 60 [0x3c]))) 85 {*boolsi3_internal1} (insn_list:REG_DEP_T RUE 52 (nil)) (expr_list:REG_DEAD (reg:SI 129) (nil))) ... ------------------- source code --------------- extern void abort (void); extern void exit (int); _Bool a; int foo () { int b; if (a) goto L1; else goto L0; L0: b = -1; goto L2; L1: b = 0x3c; L2: return b; } int main(int argc, char **argv) { a = 1; if (foo() != 0x3c) abort(); exit(0); } ---------------------------------------- rtl-dump: test.c.26.life1 ;; Function foo (foo) try_optimize_cfg iteration 1 try_optimize_cfg iteration 1 (note 2 0 6 NOTE_INSN_DELETED) (note 6 2 9 NOTE_INSN_FUNCTION_BEG) ;; Start of basic block 0, registers live: 1 [1] 2 [2] 31 [31] 67 [ap] 109 [vrsa ve] 110 [vscr] 113 [sfp] (note 9 6 11 0 [bb 0] NOTE_INSN_BASIC_BLOCK) (insn 11 9 12 0 (use (symbol_ref:DI ("a") <var_decl 0x2aaaaac0f4d0 a>)) -1 (nil) (nil)) (insn 12 11 13 0 (set (reg/f:DI 121 [ A8 ]) (zero_extend:DI (mem/u/c:SI (plus:DI (reg:DI 2 2) (const:DI (minus:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) (symbol_ref:DI ("*.LCTOC1"))))) [5 S8 A64]))) 2 {*ze ro_extendsidi2_internal1} (nil) (expr_list:REG_DEAD (reg:DI 2 2) (expr_list:REG_EQUAL (symbol_ref:DI ("a") <var_decl 0x2aaaaac0f4d0 a>) (nil)))) (insn 13 12 14 0 (set (reg:QI 123 [ a ]) (mem/c/i:QI (reg/f:DI 121 [ A8 ]) [3 a+0 S1 A8])) 257 {*movqi_internal} (insn_list:REG_DEP_TRUE 12 (nil)) (expr_list:REG_DEAD (reg/f:DI 121 [ A8 ]) (nil))) (insn 14 13 50 0 (set (reg:SI 122 [ a ]) (zero_extend:SI (reg:QI 123 [ a ]))) 20 {*rs6000.md:474} (insn_list:REG_ DEP_TRUE 13 (nil)) (expr_list:REG_DEAD (reg:QI 123 [ a ]) (nil))) (insn 50 14 51 0 (parallel [ (set (reg:SI 127) (abs:SI (reg:SI 122 [ a ]))) (clobber (scratch:SI)) ]) 54 {abssi2_nopower} (insn_list:REG_DEP_TRUE 14 (nil)) (expr_list:REG_DEAD (reg:SI 122 [ a ]) (expr_list:REG_UNUSED (scratch:SI) (nil)))) (insn 51 50 52 0 (set (reg:SI 128) (plus:SI (reg:SI 127) (const_int -1 [0xffffffffffffffff]))) 39 {*addsi3_internal1} (insn_l ist:REG_DEP_TRUE 50 (nil)) (expr_list:REG_DEAD (reg:SI 127) (nil))) (insn 52 51 54 0 (set (reg:SI 129) (ashiftrt:SI (reg:SI 128) (const_int 31 [0x1f]))) 134 {ashrsi3_no_power} (insn_list:REG_DEP_TR UE 51 (nil)) (expr_list:REG_DEAD (reg:SI 128) (nil))) (insn 54 52 26 0 (set (reg/v:SI 119 [ b ]) (ior:SI (reg:SI 129) (const_int 60 [0x3c]))) 85 {*boolsi3_internal1} (insn_list:REG_DEP_T RUE 52 (nil)) (expr_list:REG_DEAD (reg:SI 129) (nil))) (note 26 54 28 0 ("L2") NOTE_INSN_DELETED_LABEL 4) (insn 28 26 32 0 (set (reg:DI 126 [ b ]) (sign_extend:DI (reg/v:SI 119 [ b ]))) 17 {*rs6000.md:406} (insn_list:RE G_DEP_TRUE 54 (nil)) (expr_list:REG_DEAD (reg/v:SI 119 [ b ]) (nil))) (note 32 28 35 0 NOTE_INSN_FUNCTION_END) (insn 35 32 41 0 (set (reg/i:DI 3 3 [ <result> ]) (reg:DI 126 [ b ])) 274 {*movdi_internal64} (insn_list:REG_DEP_TRUE 28 ( nil)) (expr_list:REG_DEAD (reg:DI 126 [ b ]) (nil))) (insn 41 35 0 0 (use (reg/i:DI 3 3 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 35 (nil)) (nil)) ;; End of basic block 0, registers live: 1 [1] 3 [3] 31 [31] 67 [ap] 109 [vrsave] 110 [vscr] 113 [sfp] ------------------------------------------------------------ rtl-dump after combine: test.c.27.combine ;; Function foo (foo) insn_cost 11: 0 insn_cost 12: 8 insn_cost 13: 8 insn_cost 14: 4 insn_cost 50: 4 insn_cost 51: 4 insn_cost 52: 4 insn_cost 54: 4 insn_cost 28: 4 insn_cost 35: 4 insn_cost 41: 0 (note 2 0 6 NOTE_INSN_DELETED) (note 6 2 9 NOTE_INSN_FUNCTION_BEG) ;; Start of basic block 0, registers live: 1 [1] 2 [2] 31 [31] 67 [ap] 109 [vrsa ve] 110 [vscr] 113 [sfp] (note 9 6 11 0 [bb 0] NOTE_INSN_BASIC_BLOCK) (insn 11 9 12 0 (use (symbol_ref:DI ("a") <var_decl 0x2aaaaac0f4d0 a>)) -1 (nil) (nil)) (insn 12 11 13 0 (set (reg/f:DI 121 [ A8 ]) (zero_extend:DI (mem/u/c:SI (plus:DI (reg:DI 2 2) (const:DI (minus:DI (symbol_ref/u:DI ("*.LC0") [flags 0x2]) (symbol_ref:DI ("*.LCTOC1"))))) [5 S8 A64]))) 2 {*ze ro_extendsidi2_internal1} (nil) (expr_list:REG_DEAD (reg:DI 2 2) (expr_list:REG_EQUAL (symbol_ref:DI ("a") <var_decl 0x2aaaaac0f4d0 a>) (nil)))) (note 13 12 14 0 NOTE_INSN_DELETED) (note 14 13 50 0 NOTE_INSN_DELETED) (insn 50 14 51 0 (set (reg:SI 127) (zero_extend:SI (mem/c/i:QI (reg/f:DI 121 [ A8 ]) [3 a+0 S1 A8]))) 20 {* rs6000.md:474} (insn_list:REG_DEP_TRUE 12 (nil)) (expr_list:REG_DEAD (reg/f:DI 121 [ A8 ]) (nil))) (note 51 50 52 0 NOTE_INSN_DELETED) (insn 52 51 54 0 (set (reg:SI 129) (plus:SI (reg:SI 127) (const_int -1 [0xffffffffffffffff]))) 39 {*addsi3_internal1} (insn_l ist:REG_DEP_TRUE 50 (nil)) (expr_list:REG_DEAD (reg:SI 127) (nil))) (note 54 52 26 0 NOTE_INSN_DELETED) (note 26 54 28 0 ("L2") NOTE_INSN_DELETED_LABEL 4) (insn 28 26 32 0 (set (reg:DI 126 [ b ]) (ashift:DI (subreg:DI (reg:SI 129) 0) (const_int 32 [0x20]))) 224 {*ashldi3_internal1} (insn_list:REG_DEP_ TRUE 52 (nil)) (expr_list:REG_DEAD (reg:SI 129) (nil))) (note 32 28 35 0 NOTE_INSN_FUNCTION_END) (insn 35 32 41 0 (set (reg/i:DI 3 3 [ <result> ]) (ashiftrt:DI (reg:DI 126 [ b ]) (const_int 63 [0x3f]))) 236 {*ashrdi3_internal1} (insn_list:REG_DEP_ TRUE 28 (nil)) (expr_list:REG_DEAD (reg:DI 126 [ b ]) (nil))) (insn 41 35 0 0 (use (reg/i:DI 3 3 [ <result> ])) -1 (insn_list:REG_DEP_TRUE 35 (nil)) (nil)) ;; End of basic block 0, registers live: 1 [1] 3 [3] 31 [31] 67 [ap] 109 [vrsave] 110 [vscr] 113 [sfp] -- Summary: incorrect conversion of (ior (ashiftrt (plus ...))) in combine.c Product: gcc Version: 4.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: cqfu at transmeta dot com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29589