Re: [PATCH, tree-optimization] Fix for PR 52868

Igor Zamyatin Mon, 28 May 2012 13:19:24 -0700

Ping?


On Sat, May 12, 2012 at 1:26 AM, Igor Zamyatin <izamya...@gmail.com> wrote:
> Ping?
>
> On Fri, Apr 27, 2012 at 4:42 PM, Igor Zamyatin <izamya...@gmail.com> wrote:
>> On Wed, Apr 25, 2012 at 6:41 PM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>>> On Wed, Apr 25, 2012 at 4:32 PM, Igor Zamyatin <izamya...@gmail.com> wrote:
>>>> On Wed, Apr 25, 2012 at 1:14 PM, Richard Guenther
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Wed, Apr 25, 2012 at 10:56 AM, Igor Zamyatin <izamya...@gmail.com> 
>>>>> wrote:
>>>>>> Hi all!
>>>>>>
>>>>>> I'd like to post for review the patch which makes some costs adjusting
>>>>>> in get_computation_cost_at routine in ivopts part.
>>>>>> As mentioned in the PR changes also fix the bwaves regression from PR 
>>>>>> 52272.
>>>>>> Also changes introduce no degradations on spec2000/2006 and
>>>>>> EEMBC1.1/2.0(this was measured on Atom) on x86
>>>>>>
>>>>>>
>>>>>> Bootstrapped and regtested on x86. Ok to commit?
>>>>>
>>>>> I can't make sense of the patch and the comment does not help.
>>>>>
>>>>> +      diff_cost = cost.cost;
>>>>>       cost.cost /= avg_loop_niter (data->current_loop);
>>>>> +      add_cost_val = add_cost (TYPE_MODE (ctype), data->speed);
>>>>> +      /* Do cost correction if address cost is small enough
>>>>> +         and difference cost is high enough.  */
>>>>> +      if (address_p && diff_cost > add_cost_val
>>>>> +          && get_address_cost (symbol_present, var_present,
>>>>> +                               offset, ratio, cstepi,
>>>>> +                               TYPE_MODE (TREE_TYPE (utype)),
>>>>> +                               TYPE_ADDR_SPACE (TREE_TYPE (utype)),
>>>>> +                               speed, stmt_is_after_inc,
>>>>> +                               can_autoinc).cost <= add_cost_val)
>>>>> +        cost.cost += add_cost_val;
>>>>>
>>>>> Please explain more thoroughly.  It also would seem to be better to add
>>>>> an extra case, as later code does
>>>>
>>>> For example for such code
>>>>
>>>>   for (j=0; j<M;j++) {
>>>>       for (i=0; i<N; i++)
>>>>           sum += ptr->a[j][i] * ptr->c[k][i];
>>>>   }
>>>>  we currently have following gimple on x86 target (I provided a piece
>>>> of all phase output):
>>>>
>>>>           # ivtmp.13_30 = PHI <ivtmp.13_31(3), ivtmp.13_33(7)>
>>>>           D.1748_34 = (void *) ivtmp.13_30;
>>>>           D.1722_7 = MEM[base: D.1748_34, offset: 0B];
>>>>           D.1750_36 = ivtmp.27_28;
>>>>           D.1751_37 = D.1750_36 + ivtmp.13_30; <-- we got
>>>> non-invariant add which is not taken into account currently in cost
>>>> model
>>>>           D.1752_38 = (void *) D.1751_37;
>>>>           D.1753_39 = (sizetype) k_8(D);
>>>>           D.1754_40 = D.1753_39 * 800;
>>>>           D.1723_9 = MEM[base: D.1752_38, index: D.1754_40, offset: 
>>>> 16000B];
>>>>           ...
>>>>
>>>>  With proposed fix we produce:
>>>>
>>>>           # ivtmp.14_30 = PHI <ivtmp.14_31(3), 0(7)>
>>>>           D.1749_34 = (struct S *) ivtmp.25_28;
>>>>           D.1722_7 = MEM[base: D.1749_34, index: ivtmp.14_30, offset: 0B];
>>>>           D.1750_35 = (sizetype) k_8(D);
>>>>           D.1751_36 = D.1750_35 * 800;
>>>>           D.1752_37 = ptr_6(D) + D.1751_36;
>>>>           D.1723_9 = MEM[base: D.1752_37, index: ivtmp.14_30, offset: 
>>>> 16000B];
>>>>
>>>> which is more effective on platforms where address cost is cheaper
>>>> than cost of addition operation. That's basically what this adjustment
>>>> is for.
>>>
>>> If we generally miss to account for the add then why is the adjustment
>>> conditional on diff_cost > add_cost and address_cost <= add_cost?
>>>
>>> Is this a new heuristic or a fix for not accurately computing the cost for 
>>> the
>>> stmts we generate?
>>
>> I'd say this is closer to heuristic since diff_cost > add_cost is an
>> attempt to catch the case with non-invariant add produced by pointer
>> difference and address_cost <=add_cost leaves the cases with cheap
>> address operations
>>
>>>
>>> Richard.
>>>
>>>> So comment in the source code now looks as follows
>>>>
>>>> /* Do cost correction when address difference produces
>>>>   additional non-invariant add operation which is less
>>>>   profitable if address cost is cheaper than cost of add.  */
>>>>
>>>>>
>>>>>  /* Now the computation is in shape symbol + var1 + const + ratio * var2.
>>>>>     (symbol/var1/const parts may be omitted).  If we are looking for an
>>>>>     address, find the cost of addressing this.  */
>>>>>  if (address_p)
>>>>>    return add_costs (cost,
>>>>>                      get_address_cost (symbol_present, var_present,
>>>>>                                        offset, ratio, cstepi,
>>>>>                                        TYPE_MODE (TREE_TYPE (utype)),
>>>>>                                        TYPE_ADDR_SPACE (TREE_TYPE 
>>>>> (utype)),
>>>>>                                        speed, stmt_is_after_inc,
>>>>>                                        can_autoinc));
>>>>>
>>>>> thus refactoring the code a bit would make it possible to CSE the
>>>>> get_address_cost
>>>>> call and eventually make it clearer what the code does.
>>>>
>>>> 'offset' could be changed beetween two calls of get_address_cost so
>>>> such refactoring looks useless.
>>>>
>>>> New patch (only the comment was changed) attached. Changelog was
>>>> changed as well.
>>>>
>>>>>
>>>>> Richard.
>>>>>
>>>>
>>>> Changelog:
>>>>
>>>>  2012-04-26  Yuri Rumyantsev  <yuri.rumyant...@intel.com>
>>>>
>>>>         * tree-ssa-loop-ivopts.c (get_computation_cost_at): Adjust
>>>>        cost model when address difference produces additional
>>>>        non-invariant add operation which is less profitable if
>>>>        address cost is cheaper than cost of add.
>>>>
>>>> Thanks,
>>>> Igor

Re: [PATCH, tree-optimization] Fix for PR 52868

Reply via email to