On Wed, 2012-03-21 at 10:33 +0100, Richard Guenther wrote:
> On Mon, Mar 19, 2012 at 2:19 AM, Andrew Pinski <pins...@gmail.com> wrote:
> > On Sun, Mar 18, 2012 at 6:12 PM, William J. Schmidt
> > <wschm...@linux.vnet.ibm.com> wrote:
> >> Greetings,
> >>
> >> Now that we're into stage 1 again, I'd like to submit the first round of
> >> changes for dominator-based strength reduction, which will address
> >> issues from PR22586, PR35308, PR46556, and perhaps others.  I'm
> >> attaching two patches: the smaller (slsr-part1) is the patch I'm
> >> submitting for approval today, while the larger (slsr-fyi) is for
> >> reference only, but may be useful if questions arise about how the small
> >> patch fits into the intended whole.
> >>
> >> This patch contains the logic for identifying strength reduction
> >> candidates, and makes replacements only for those candidates where the
> >> stride is a fixed constant.  Replacement for candidates with fixed but
> >> unknown strides are not implemented herein, but that logic can be viewed
> >> in the larger patch.  This patch does not address strength reduction of
> >> data reference expressions, or candidates with conditional increments;
> >> those issues will be dealt with in future patches.
> >>
> >> The cost model is built on the one used by tree-ssa-ivopts.c, and I've
> >> added some new instruction costs to that model in place.  It might
> >> eventually be good to divorce that modeling code from IVOPTS, but that's
> >> an orthogonal patch and somewhat messy.
> >
> > I think this is the wrong way to do straight line strength reduction
> > considering we have a nice value numbering system which should be easy
> > to extended to support it.
> 
> Well, it is easy to handle very specific easy cases like
> 
> a = i * 2;
> b = i * 3;
> c = i * 4;
> 
> to transform it to
> 
> a = i * 2;
> b = a + i;
> c = b + i;
> 
> but already
> 
> a = i * 2;
> b = i * 4;
> c = i * 6;
> 
> would need extra special code.  The easy case could be handled in eliminate ()
> by, when seeing A * CST, looking up A * (CST - 1) and if that
> succeeds, transform
> it to VAL + A.  Cost issues are increasing the lifetime of VAL.  I've done 
> this
> simple case at some point, but it failed to handle the common associated 
> cases,
> when we transform (a + 1) * 2, (a + 1) * 3, etc. to a * 2 + 2, a * 3 +
> 3, etc.  I think
> it is the re-association in case of a strength-reduction opportunity
> that makes the
> separate pass better?  How would you suggest handling this case in the
> VN framework?  Detect the a * 3 + 3 pattern and then do two lookups, one for
> a * 2 and one for val + 2?  But then we still don't have a value for a + 1
> to re-use ...

And it becomes even more difficult with more complex scenarios.
Consider:

a = x + (3 * s);
b = x + (5 * s);
c = x + (7 * s);

The framework I've developed recognizes that this group of instructions
is related, and that it is profitable to replace them as follows:

a = x + (3 * s);
t = 2 * s;
b = a + t;
c = b + t;

The introduced multiply by 2 (one shift) is far cheaper than the two
multiplies that it replaces.  However, suppose you have instead:

a = x + (2 * s);
b = x + (8 * s);

Now it isn't profitable to replace this by:

a = x + (2 * s);
t = 6 * s;
b = a + t;

since a multiply by 6 (2 shifts, one add) is more costly than a multiply
by 8 (one shift).  To make these decisions correctly requires analyzing
all the related statements together, which value numbering as it stands
is not equipped to do.  Logic to handle these cases is included in my
larger "fyi" patch.

As another example, consider conditionally-executed increments:

a = i * 5;
if (...)
  i = i + 1;
b = i * 5;

This can be correctly and profitably strength-reduced as:

a = i * 5;
t = a;
if (...)
  {
    i = i + 1;
    t = t + 5;
  }
b = t;

(This is an approximation to the actual phi representation, which I've
omitted for clarity.)  Again, this kind of analysis is not something
that fits naturally into value numbering.  I don't yet have this in the
"fyi" patch, but have it largely working in a private version.

My conclusion is that if strength reduction is done in value numbering,
it must either be a very limited form of strength reduction, or the kind
of logic I've developed that considers chains of related candidates
together must be "glued onto" value numbering.  I think the latter would
be a mistake, as it would introduce much unnecessary complexity to what
is currently a very clean approach to PRE; the strength reduction would
become an ugly wart that people would complain about.  I think it's far
cleaner to keep the two issues separate.

> 
> Bill, experimenting with pattern detection in eliminate () would be a
> possibility.

For the reasons expressed above, I don't think that would get very far
or make anyone very happy...

I appreciate Andrew's view that value numbering is a logical place to do
strength reduction, but after considering the problem over the last few
months I have to disagree.  If you don't mind, at this point I would
prefer to have my current patch considered on its merits.

Thanks,
Bill

> 
> Thanks,
> Richard.
> 
> 
> 
> > Thanks,
> > Andrew pinski
> >
> >
> >>
> >> Thanks,
> >> Bill
> >>
> >>
> >> gcc:
> >>
> >> 2012-03-18  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
> >>
> >>        * tree-pass.h (pass_strength_reduction): New decl.
> >>        * tree-ssa-loop-ivopts.c (add_cost): Remove #undef; rename to
> >>        add_regs_cost.
> >>        (multiply_regs_cost): New function.
> >>        (add_const_cost): Likewise.
> >>        (extend_or_trunc_cost): Likewise.
> >>        (negate_cost): Likewise.
> >>        (get_address_cost): Rename add_cost to add_regs_cost.
> >>        (force_expr_to_var_cost): Likewise.
> >>        (get_computation_cost_at): Likewise.
> >>        (determine_iv_cost): Likewise.
> >>        * timevar.def (TV_TREE_SLSR): New timevar.
> >>        * tree-ssa-strength-reduction.c: New.
> >>        * tree-flow.h (add_regs_cost): New decl.
> >>        (multiply_regs_cost): Likewise.
> >>        (add_const_cost): Likewise.
> >>        (extend_or_trunc_cost): Likewise.
> >>        (negate_cost): Likewise.
> >>        * Makefile.in (tree-ssa-strength-reduction.o): New dependencies.
> >>        * passes.c (init_optimization_passes): Add pass_strength_reduction.
> >>
> >> gcc/testsuite:
> >>
> >> 2012-03-18  Bill Schmidt  <wschm...@linux.vnet.ibm.com>
> >>
> >>        * gcc.dg/tree-ssa/slsr-1.c: New test.
> >>        * gcc.dg/tree-ssa/slsr-2.c: Likewise.
> >>        * gcc.dg/tree-ssa/slsr-3.c: Likewise.
> >>        * gcc.dg/tree-ssa/slsr-4.c: Likewise.
> >>
> 

Reply via email to