Ping?
On Sat, May 12, 2012 at 1:26 AM, Igor Zamyatin <izamya...@gmail.com> wrote: > Ping? > > On Fri, Apr 27, 2012 at 4:42 PM, Igor Zamyatin <izamya...@gmail.com> wrote: >> On Wed, Apr 25, 2012 at 6:41 PM, Richard Guenther >> <richard.guent...@gmail.com> wrote: >>> On Wed, Apr 25, 2012 at 4:32 PM, Igor Zamyatin <izamya...@gmail.com> wrote: >>>> On Wed, Apr 25, 2012 at 1:14 PM, Richard Guenther >>>> <richard.guent...@gmail.com> wrote: >>>>> On Wed, Apr 25, 2012 at 10:56 AM, Igor Zamyatin <izamya...@gmail.com> >>>>> wrote: >>>>>> Hi all! >>>>>> >>>>>> I'd like to post for review the patch which makes some costs adjusting >>>>>> in get_computation_cost_at routine in ivopts part. >>>>>> As mentioned in the PR changes also fix the bwaves regression from PR >>>>>> 52272. >>>>>> Also changes introduce no degradations on spec2000/2006 and >>>>>> EEMBC1.1/2.0(this was measured on Atom) on x86 >>>>>> >>>>>> >>>>>> Bootstrapped and regtested on x86. Ok to commit? >>>>> >>>>> I can't make sense of the patch and the comment does not help. >>>>> >>>>> + diff_cost = cost.cost; >>>>> cost.cost /= avg_loop_niter (data->current_loop); >>>>> + add_cost_val = add_cost (TYPE_MODE (ctype), data->speed); >>>>> + /* Do cost correction if address cost is small enough >>>>> + and difference cost is high enough. */ >>>>> + if (address_p && diff_cost > add_cost_val >>>>> + && get_address_cost (symbol_present, var_present, >>>>> + offset, ratio, cstepi, >>>>> + TYPE_MODE (TREE_TYPE (utype)), >>>>> + TYPE_ADDR_SPACE (TREE_TYPE (utype)), >>>>> + speed, stmt_is_after_inc, >>>>> + can_autoinc).cost <= add_cost_val) >>>>> + cost.cost += add_cost_val; >>>>> >>>>> Please explain more thoroughly. It also would seem to be better to add >>>>> an extra case, as later code does >>>> >>>> For example for such code >>>> >>>> for (j=0; j<M;j++) { >>>> for (i=0; i<N; i++) >>>> sum += ptr->a[j][i] * ptr->c[k][i]; >>>> } >>>> we currently have following gimple on x86 target (I provided a piece >>>> of all phase output): >>>> >>>> # ivtmp.13_30 = PHI <ivtmp.13_31(3), ivtmp.13_33(7)> >>>> D.1748_34 = (void *) ivtmp.13_30; >>>> D.1722_7 = MEM[base: D.1748_34, offset: 0B]; >>>> D.1750_36 = ivtmp.27_28; >>>> D.1751_37 = D.1750_36 + ivtmp.13_30; <-- we got >>>> non-invariant add which is not taken into account currently in cost >>>> model >>>> D.1752_38 = (void *) D.1751_37; >>>> D.1753_39 = (sizetype) k_8(D); >>>> D.1754_40 = D.1753_39 * 800; >>>> D.1723_9 = MEM[base: D.1752_38, index: D.1754_40, offset: >>>> 16000B]; >>>> ... >>>> >>>> With proposed fix we produce: >>>> >>>> # ivtmp.14_30 = PHI <ivtmp.14_31(3), 0(7)> >>>> D.1749_34 = (struct S *) ivtmp.25_28; >>>> D.1722_7 = MEM[base: D.1749_34, index: ivtmp.14_30, offset: 0B]; >>>> D.1750_35 = (sizetype) k_8(D); >>>> D.1751_36 = D.1750_35 * 800; >>>> D.1752_37 = ptr_6(D) + D.1751_36; >>>> D.1723_9 = MEM[base: D.1752_37, index: ivtmp.14_30, offset: >>>> 16000B]; >>>> >>>> which is more effective on platforms where address cost is cheaper >>>> than cost of addition operation. That's basically what this adjustment >>>> is for. >>> >>> If we generally miss to account for the add then why is the adjustment >>> conditional on diff_cost > add_cost and address_cost <= add_cost? >>> >>> Is this a new heuristic or a fix for not accurately computing the cost for >>> the >>> stmts we generate? >> >> I'd say this is closer to heuristic since diff_cost > add_cost is an >> attempt to catch the case with non-invariant add produced by pointer >> difference and address_cost <=add_cost leaves the cases with cheap >> address operations >> >>> >>> Richard. >>> >>>> So comment in the source code now looks as follows >>>> >>>> /* Do cost correction when address difference produces >>>> additional non-invariant add operation which is less >>>> profitable if address cost is cheaper than cost of add. */ >>>> >>>>> >>>>> /* Now the computation is in shape symbol + var1 + const + ratio * var2. >>>>> (symbol/var1/const parts may be omitted). If we are looking for an >>>>> address, find the cost of addressing this. */ >>>>> if (address_p) >>>>> return add_costs (cost, >>>>> get_address_cost (symbol_present, var_present, >>>>> offset, ratio, cstepi, >>>>> TYPE_MODE (TREE_TYPE (utype)), >>>>> TYPE_ADDR_SPACE (TREE_TYPE >>>>> (utype)), >>>>> speed, stmt_is_after_inc, >>>>> can_autoinc)); >>>>> >>>>> thus refactoring the code a bit would make it possible to CSE the >>>>> get_address_cost >>>>> call and eventually make it clearer what the code does. >>>> >>>> 'offset' could be changed beetween two calls of get_address_cost so >>>> such refactoring looks useless. >>>> >>>> New patch (only the comment was changed) attached. Changelog was >>>> changed as well. >>>> >>>>> >>>>> Richard. >>>>> >>>> >>>> Changelog: >>>> >>>> 2012-04-26 Yuri Rumyantsev <yuri.rumyant...@intel.com> >>>> >>>> * tree-ssa-loop-ivopts.c (get_computation_cost_at): Adjust >>>> cost model when address difference produces additional >>>> non-invariant add operation which is less profitable if >>>> address cost is cheaper than cost of add. >>>> >>>> Thanks, >>>> Igor