On Wed, Nov 2, 2016 at 11:08 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Tue, Nov 1, 2016 at 4:24 PM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>> On Thu, Aug 11, 2016 at 11:38 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Thu, Aug 11, 2016 at 11:56 AM, Bin.Cheng <amker.ch...@gmail.com> wrote:
>>>> On Thu, Aug 11, 2016 at 10:50 AM, Richard Biener
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Wed, Aug 10, 2016 at 5:58 PM, Bin Cheng <bin.ch...@arm.com> wrote:
>>>>>> Hi,
>>>>>> Due to some reasons, tree-if-conv.c now factors floating point 
>>>>>> comparison out of cond_expr,
>>>>>> resulting in mixed types in it.  This does help CSE on common comparison 
>>>>>> operations.
>>>>>> Only problem is that test gcc.dg/vect/pr56541.c now requires 
>>>>>> vect_cond_mixed to be
>>>>>> vectorized.  This patch changes the test in that way.
>>>>>> Test result checked.  Is it OK?
>>>>>
>>>>> Hmm, I think the fix is to fix if-conversion not doing that.  Can you
>>>>> track down why this happens?
>>>> Hmm, but there are several common floating comparison operations in
>>>> the case, by doing this, we could do CSE on GIMPLE, otherwise we
>>>> depends on RTL optimizers.
>>>
>>> I see.
>>>
>>>> I thought we prefer GIMPLE level
>>>> transforms?
>>>
>>> Yes, but the vectorizer is happier with the conditions present in the 
>>> COND_EXPR
>>> and thus we concluded we always want to have them there.  forwprop will
>>> also aggressively put them back.  Note that we cannot put back
>>> tree_could_throw_p
>>> conditions (FP compares with signalling nans for example) to properly model 
>>> EH
>>> (though for VEC_COND_EXPRs we don't really care here).
>>>
>>> Note that nothing between if-conversion and vectorization will perform
>>> the desired
>>> CSE anyway.
>> Hi Richard,
>> I looked into this one and found it was not if-conv factors cond_expr
>> out.  For test case:
>>
>>   for (i=0; i!=1024; ++i)
>>     {
>>       float rR = a*z[i];
>>       float rL = b*z[i];
>>       float rMin = (rR<rL) ? rR : rL;
>>       float rMax = (rR<rL) ? rL : rR;
>>       rMin = (rMax>0) ? rMin : rBig;
>>       rMin = (rMin>0) ? rMin : rMax;
>>       ok[i] = rMin-c<rMax+d;
>>     }
>> Dump before jump threading is like:
>>
>>   <bb 7>:
>>   # iftmp.3_12 = PHI <rL_18(5), rR_17(6)>
>>   if (iftmp.3_12 > 0.0)
>>     goto <bb 9>;
>>   else
>>     goto <bb 8>;
>>
>>   <bb 8>:
>>
>>   <bb 9>:
>>   # iftmp.4_13 = PHI <iftmp.2_11(7), 1.5e+2(8)>
>>   if (iftmp.4_13 > 0.0)
>>     goto <bb 11>;
>>   else
>>     goto <bb 10>;
>>
>>   <bb 10>:
>>
>>   <bb 11>:
>>   # iftmp.5_14 = PHI <iftmp.4_13(9), iftmp.3_12(10)>
>>
>> Jump thread in dom pass threads edges (bb7 -> bb8 -> ... bb11) to (bb6
>> -> bb12 -> bb9) as below:
>>
>>   <bb 6>:
>>   # iftmp.3_12 = PHI <rL_18(4), rR_17(5), rL_18(11)>
>>   # iftmp.2_23 = PHI <iftmp.2_11(4), iftmp.2_11(5), iftmp.2_10(11)>
>>   if (iftmp.3_12 > 0.0)
>>     goto <bb 7>;
>>   else
>>     goto <bb 12>;
>>
>>   <bb 7>:
>>   # iftmp.4_13 = PHI <iftmp.2_23(6)>
>>   if (iftmp.4_13 > 0.0)
>>     goto <bb 9>;
>>   else
>>     goto <bb 8>;
>>
>>   <bb 8>:
>>
>>   <bb 9>:
>>   # iftmp.5_14 = PHI <iftmp.4_13(7), iftmp.3_12(8), 1.5e+2(12)>
>>
>>   //...
>>   <bb 12>:
>>   # iftmp.4_22 = PHI <1.5e+2(6)>
>>   goto <bb 9>;
>>
>> This transform saves one comparison on the path, but creates multi-arg
>> phi, resulting in cond_expr being factored out.
>>
>> Looks like threading corrupts vectorization opportunity for target
>> doesn't support vect_cond_mixed, but I guess it's hard to tell in
>> threading itself.  Any ideas?
>
> First of all I wonder why MIN/MAX are not pattern matched earlier...
> (I'd generally avoid threading through them).
Hmm, this case is not MIN/MAX cases, though it uses rMin/rMax as
variable names.  So it cannot be pattern matched.

As for below cases you mentioned.  Since I am introducing
more/advanced pattern in match.pd and adding call to fold_stmt in
fold_stmt, I think they can be addressed in this way too, I will keep
this in mind in my cond_expr pattern work.  So we don't need to go
that far to vector pattern matches.

Thanks,
bin
>
> But yes, I see that if-conversion can cause factored out conditons.
> Also for the simple case of ! condition:
>
> _49 = iftmp.3_12 > 0.0;
>   _50 = ~_49;
>   iftmp.5_14 = _50 ? 1.5e+2 : _ifc__48;
>
> but those should have been eliminated by swapping operands (looks like
> that doesn't reliably work).
>
> In the end we have the bool patterns in the vectorizer to recover from this,
> so I wonder why that doesn't work then.  It should re-write the above to
>
>  _111 = iftmp.3_12 > 0.0 ? 1 : 0;  // of proper type
>  _112 = ~ _111;
>  iftmp.5_14 = _112 != 0 ? ....;
>
> Richard.
>
>> Thanks,
>> bin

Reply via email to