On Fri, Mar 6, 2009 at 4:09 PM, Paolo Bonzini <[email protected]> wrote:
> Richard Guenther wrote:
>> On Fri, Mar 6, 2009 at 3:29 PM, Paolo Bonzini <[email protected]> wrote:
>>>> So while trapping variants can certainly be introduced it looks like
>>>> this task may be more difficult.
>>> I don't think you need to introduce trapping tree codes. You can
>>> introduce them directly in the front-end as
>>>
>>> s = x +nv y
>>
>> I think this should be
>>
>> s = x + y
>> (((s ^ x) & (s ^ y)) < 0) ? trap () : s
>>
>> otherwise the compiler can assume that for the following check
>> the addition did not overflow.
>
> Ah yeah I've not yet looked at the patches and I did not know which one
> was which. I actually wrote x + y first and then went back to carefully
> check them. :-P
>
>>> Making sure they are compiled efficiently is another story, but
>>> especially for the sake of LTO I think this is the way to go.
>>
>> I agree. Btw, for the addition case we generate
>>
>> leal (%rsi,%rdi), %eax
>> xorl %eax, %esi
>> xorl %eax, %edi
>> testl %edi, %esi
>> jns .L2
>> .value 0x0b0f
>> .L2:
>> rep
>> ret
>>
>> which isn't too bad.
>
> Well, for x86 it requires the addends to die.
>
> This is unfortunately four insns, and combine has a limit of three, but
> maybe you could make combine recognize the check and turn it to an addv
> pattern (with the add result unused!); and then CSE or maybe combine as
> well would, well, eliminate the duplicate ADD...
Well, I was thinking about detecting the pattern on the tree level instead.

  s_6 = x.0_2 + y.1_4;
  D.1597_7 = s_6 ^ x_1(D);
  D.1598_8 = s_6 ^ y_3(D);
  D.1599_9 = D.1597_7 & D.1598_8;
  if (D.1599_9 < 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  __builtin_trap ();

<bb 4>:

This should be recognizable in the ifcombine pass, for example, which
recognizes CFG patterns.  It could transform the above to just

  s_6 = __builtin_addv (x.0_2, y.1_4);

<bb 4>:

Only ifcombine maybe runs a little too early for that.

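For reference, the check being matched can be written in plain C roughly
as below.  This is only a sketch: the names add_overflows and addv are
made up for illustration and are not taken from the patches, and it
assumes a two's complement int without padding bits.  The sum is done in
unsigned arithmetic so that the wraparound is well defined and the
compiler cannot assume the addition did not overflow.

```c
#include <limits.h>

/* Hypothetical helper (not from the patches): returns 1 if x + y would
   overflow int.  Signed addition overflows exactly when the wrapped sum
   differs in sign from both operands, i.e. when the sign bit of
   (s ^ x) & (s ^ y) is set.  */
static int add_overflows(int x, int y)
{
    unsigned int ux = (unsigned int)x;
    unsigned int uy = (unsigned int)y;
    unsigned int s = ux + uy;                 /* wraps, no undefined behavior */
    unsigned int sign_bit = 1u << (sizeof(unsigned int) * CHAR_BIT - 1);

    return (((s ^ ux) & (s ^ uy)) & sign_bit) != 0;
}

/* Hypothetical trapping add: traps on overflow, otherwise returns the sum.
   __builtin_trap is the GCC builtin used in the dump above.  */
static int addv(int x, int y)
{
    if (add_overflows(x, y))
        __builtin_trap();
    return x + y;                             /* cannot overflow here */
}
```

With this, addv (INT_MAX, 1) and addv (INT_MIN, -1) trap, while
addv (INT_MAX, INT_MIN) returns -1 as usual.
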
Richard.