On Wed, 5 Oct 2016, Jeff Law wrote:

> On 07/08/2016 05:17 AM, Richard Biener wrote:
> > On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote:
> > 
> > > Hi Richard,
> > > For the following test-case:
> > > 
> > > int f(int x, int y)
> > > {
> > >    int ret;
> > > 
> > >    if (x == y)
> > >      ret = x ^ y;
> > >    else
> > >      ret = 1;
> > > 
> > >    return ret;
> > > }
> > > 
> > > I was wondering if x ^ y should be folded to 0 since
> > > it's guarded by condition x == y ?
> > > 
> > > optimized dump shows:
> > > f (int x, int y)
> > > {
> > >   int iftmp.0_1;
> > >   int iftmp.0_4;
> > > 
> > >   <bb 2>:
> > >   if (x_2(D) == y_3(D))
> > >     goto <bb 3>;
> > >   else
> > >     goto <bb 4>;
> > > 
> > >   <bb 3>:
> > >   iftmp.0_4 = x_2(D) ^ y_3(D);
> > > 
> > >   <bb 4>:
> > >   # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)>
> > >   return iftmp.0_1;
> > > 
> > > }
> > > 
> > > The attached patch tries to fold for above case.
> > > I am checking if op0 and op1 are equal using:
> > > if (bitmap_intersect_p (vr1->equiv, vr2->equiv)
> > >    && operand_equal_p (vr1->min, vr1->max)
> > >    && operand_equal_p (vr2->min, vr2->max))
> > >   { /* equal /* }
> > > 
> > > I suppose intersection would check if op0 and op1 have equivalent ranges,
> > > and added operand_equal_p check to ensure that there is only one
> > > element within the range. Does that look correct ?
> > > Bootstrap+test in progress on x86_64-unknown-linux-gnu.
> > 
> > I think VRP is the wrong place to catch this and DOM should have but it
> > does
> > 
> > Optimizing block #3
> > 
> > 1>>> STMT 1 = x_2(D) le_expr y_3(D)
> > 1>>> STMT 1 = x_2(D) ge_expr y_3(D)
> > 1>>> STMT 1 = x_2(D) eq_expr y_3(D)
> > 1>>> STMT 0 = x_2(D) ne_expr y_3(D)
> > 0>>> COPY x_2(D) = y_3(D)
> > 0>>> COPY y_3(D) = x_2(D)
> > Optimizing statement ret_4 = x_2(D) ^ y_3(D);
> >   Replaced 'x_2(D)' with variable 'y_3(D)'
> >   Replaced 'y_3(D)' with variable 'x_2(D)'
> >   Folded to: ret_4 = x_2(D) ^ y_3(D);
> > LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D)
> > 
> > heh, registering both reqivalencies is obviously not going to help...
> > 
> > The 2nd equivalence is from doing
> > 
> >       /* We already recorded that LHS = RHS, with canonicalization,
> >          value chain following, etc.
> > 
> >          We also want to record RHS = LHS, but without any
> > canonicalization
> >          or value chain following.  */
> >       if (TREE_CODE (rhs) == SSA_NAME)
> >         const_and_copies->record_const_or_copy_raw (rhs, lhs,
> >                                                     SSA_NAME_VALUE (rhs));
> > 
> > generally recording both is not helpful.  Jeff?  This seems to be
> > r233207 (fix for PR65917) which must have regressed this testcase.
> The primary purpose of the const/copies table is to enable simple const/copy
> propagation.  So given a = b, we record that "a" can be replaced with "b" and
> from then on we replace occurrences of "a" with "b" -- ie, copy propagation.
> 
> We have a much lower priority goal of recording equivalences that arise from
> conditionals such as if (a == b) and using those in const/copy propagation.
> Given the structure of the table, we can record that a = b or b = a.  The
> problem is we've never come up with any good heuristics for which of those two
> styles to record.  Which you got was mostly a function of canonicalization
> using SSA_NAME_VERSIONs.
> 
> As seen in 65917, sometimes that decision is wrong and can lead to performance
> regressions.  But it's really just luck of the draw whether it gets optimized
> or not.  And the test
> 
> 65917 allows us to record both equivalences in the table.  We do that by
> bypassing all the checks/canonicalizations with a raw insertion method. During
> DOM we can now lookup either form which allows us to fix the regression in
> 65917.
> 
> It ought to be easy to fold x ^ y to zero when x == y (famous last words).
> I'm sure I'll regret saying that when I go to look at how to twiddle DOM
> appropriately.

Interesting idea.  Though it get's (theoretically) interesting for
ternary ops where you'd need to try 6 variants then ... thus in the
end it's a hack ...

That said, having SSA_NAME_VALUE (SSA_NAME_VALUE (t)) == t looks like
a bug in the lattice to me.

Richard.

Reply via email to