On Tue, Feb 7, 2012 at 11:00 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Mon, Feb 6, 2012 at 10:30 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
>
>>> Hmm.  Well, the only thing that's going to work for x86 is the 
>>> double-compare
>>> elimination portion.
>>>
>>> If we want to use this pass for x86, then for 4.8 we should also fix the
>>> discrepancy between the compare-elim canonical
>>>
>>>  [(operate)
>>>   (set-cc)]
>>>
>>> and the combine canonical
>>>
>>>  [(set-cc)
>>>   (operate)]
>>>
>>> (Because of the simplicity of the substitution in compare-elim, I prefer
>>> the former as the canonical canonical.)
>>
>> You are probably referring to following testcase:
>>
>> --cut here--
>> int test (int a, int b)
>> {
>>  int lt = a + b < 0;
>>  int eq = a + b == 0;
>>  if (lt)
>>    return 1;
>>  return eq;
>> }
>> --cut here--
>>
>> where combine creates:
>>
>> Trying 8 -> 9:
>> Successfully matched this instruction:
>> (parallel [
>>        (set (reg:CCZ 17 flags)
>>            (compare:CCZ (plus:SI (reg/v:SI 63 [ a ])
>>                    (reg/v:SI 64 [ b ]))
>>                (const_int 0 [0])))
>>        (set (reg:SI 60 [ D.1710 ])
>>            (plus:SI (reg/v:SI 63 [ a ])
>>                (reg/v:SI 64 [ b ])))
>>    ])
>
> Attached patch teaches combine to swap operands of a double set
> pattern and retries recognition. Also added are minimum
> target-dependant changes to handle the testcase above.
>
> Unfortunately, compare elimination was not able to remove redundant
> compare, although the testcase is carefully crafted to require only
> sign flag to be valid. Following enters compare-elim pass:
>
> (insn 9 8 10 2 (parallel [
>            (set (reg:SI 5 di [orig:60 D.1710 ] [60])
>                (plus:SI (reg/v:SI 5 di [orig:63 a ] [63])
>                    (reg/v:SI 4 si [orig:64 b ] [64])))
>            (set (reg:CCZ 17 flags)
>                (compare:CCZ (plus:SI (reg/v:SI 5 di [orig:63 a ] [63])
>                        (reg/v:SI 4 si [orig:64 b ] [64]))
>                    (const_int 0 [0])))
>        ]) cmp.c:4 261 {*addsi_2}
>     (nil))
>
> (note 10 9 33 2 NOTE_INSN_DELETED)
>
> (insn 33 10 34 2 (set (reg:QI 1 dx [65])
>        (eq:QI (reg:CCZ 17 flags)
>            (const_int 0 [0]))) cmp.c:4 595 {*setcc_qi}
>     (nil))
>
> (insn 34 33 30 2 (set (reg:SI 1 dx [65])
>        (zero_extend:SI (reg:QI 1 dx [65]))) cmp.c:4 123
> {*zero_extendqisi2_movzbl}
>     (nil))
>
> (insn 30 34 29 2 (set (reg/v:SI 0 ax [orig:59 eq ] [59])
>        (const_int 1 [0x1])) cmp.c:6 64 {*movsi_internal}
>     (expr_list:REG_EQUAL (const_int 1 [0x1])
>        (nil)))
>
> (insn 29 30 31 2 (set (reg:CCGOC 17 flags)
>        (compare:CCGOC (reg:SI 5 di [orig:60 D.1710 ] [60])
>            (const_int 0 [0]))) cmp.c:6 2 {*cmpsi_ccno_1}
>     (nil))
>
> (insn 31 29 25 2 (set (reg/v:SI 0 ax [orig:59 eq ] [59])
>        (if_then_else:SI (ge (reg:CCGOC 17 flags)
>                (const_int 0 [0]))
>            (reg:SI 1 dx [65])
>            (reg/v:SI 0 ax [orig:59 eq ] [59]))) cmp.c:6 903 {*movsicc_noc}
>     (nil))
>
> The resulting code still includes redundant test that sets sign flag:
>
> test:
>        addl    %esi, %edi
>        movl    $1, %eax
>        sete    %dl
>  >>   testl   %edi, %edi
>        movzbl  %dl, %edx
>        cmovns  %edx, %eax
>        ret
>
> (BTW: I think that the change to combine.c would be nice to have, to
> find more other combine opportunities. I will propose the patch
> separately.)

Shouldn't there be a canonical order for parallels throughout the whole
compiler?  Maybe just enforced by gen_rtx_PARALLEL / RTL checking?
At least as far as I understand "execution order" of insns inside a PARALLEL
is undefined.

Richard.

> Uros.

Reply via email to