On Tue, Feb 7, 2012 at 11:00 AM, Uros Bizjak <ubiz...@gmail.com> wrote: > On Mon, Feb 6, 2012 at 10:30 PM, Uros Bizjak <ubiz...@gmail.com> wrote: > >>> Hmm. Well, the only thing that's going to work for x86 is the >>> double-compare >>> elimination portion. >>> >>> If we want to use this pass for x86, then for 4.8 we should also fix the >>> discrepancy between the compare-elim canonical >>> >>> [(operate) >>> (set-cc)] >>> >>> and the combine canonical >>> >>> [(set-cc) >>> (operate)] >>> >>> (Because of the simplicity of the substitution in compare-elim, I prefer >>> the former as the canonical canonical.) >> >> You are probably referring to following testcase: >> >> --cut here-- >> int test (int a, int b) >> { >> int lt = a + b < 0; >> int eq = a + b == 0; >> if (lt) >> return 1; >> return eq; >> } >> --cut here-- >> >> where combine creates: >> >> Trying 8 -> 9: >> Successfully matched this instruction: >> (parallel [ >> (set (reg:CCZ 17 flags) >> (compare:CCZ (plus:SI (reg/v:SI 63 [ a ]) >> (reg/v:SI 64 [ b ])) >> (const_int 0 [0]))) >> (set (reg:SI 60 [ D.1710 ]) >> (plus:SI (reg/v:SI 63 [ a ]) >> (reg/v:SI 64 [ b ]))) >> ]) > > Attached patch teaches combine to swap operands of a double set > pattern and retries recognition. Also added are minimum > target-dependant changes to handle the testcase above. > > Unfortunately, compare elimination was not able to remove redundant > compare, although the testcase is carefully crafted to require only > sign flag to be valid. Following enters compare-elim pass: > > (insn 9 8 10 2 (parallel [ > (set (reg:SI 5 di [orig:60 D.1710 ] [60]) > (plus:SI (reg/v:SI 5 di [orig:63 a ] [63]) > (reg/v:SI 4 si [orig:64 b ] [64]))) > (set (reg:CCZ 17 flags) > (compare:CCZ (plus:SI (reg/v:SI 5 di [orig:63 a ] [63]) > (reg/v:SI 4 si [orig:64 b ] [64])) > (const_int 0 [0]))) > ]) cmp.c:4 261 {*addsi_2} > (nil)) > > (note 10 9 33 2 NOTE_INSN_DELETED) > > (insn 33 10 34 2 (set (reg:QI 1 dx [65]) > (eq:QI (reg:CCZ 17 flags) > (const_int 0 [0]))) cmp.c:4 595 {*setcc_qi} > (nil)) > > (insn 34 33 30 2 (set (reg:SI 1 dx [65]) > (zero_extend:SI (reg:QI 1 dx [65]))) cmp.c:4 123 > {*zero_extendqisi2_movzbl} > (nil)) > > (insn 30 34 29 2 (set (reg/v:SI 0 ax [orig:59 eq ] [59]) > (const_int 1 [0x1])) cmp.c:6 64 {*movsi_internal} > (expr_list:REG_EQUAL (const_int 1 [0x1]) > (nil))) > > (insn 29 30 31 2 (set (reg:CCGOC 17 flags) > (compare:CCGOC (reg:SI 5 di [orig:60 D.1710 ] [60]) > (const_int 0 [0]))) cmp.c:6 2 {*cmpsi_ccno_1} > (nil)) > > (insn 31 29 25 2 (set (reg/v:SI 0 ax [orig:59 eq ] [59]) > (if_then_else:SI (ge (reg:CCGOC 17 flags) > (const_int 0 [0])) > (reg:SI 1 dx [65]) > (reg/v:SI 0 ax [orig:59 eq ] [59]))) cmp.c:6 903 {*movsicc_noc} > (nil)) > > The resulting code still includes redundant test that sets sign flag: > > test: > addl %esi, %edi > movl $1, %eax > sete %dl > >> testl %edi, %edi > movzbl %dl, %edx > cmovns %edx, %eax > ret > > (BTW: I think that the change to combine.c would be nice to have, to > find more other combine opportunities. I will propose the patch > separately.)
Shouldn't there be a canonical order for parallels throughout the whole compiler? Maybe just enforced by gen_rtx_PARALLEL / RTL checking? At least as far as I understand "execution order" of insns inside a PARALLEL is undefined. Richard. > Uros.