On Wed, 5 Oct 2016, Jeff Law wrote: > On 07/08/2016 05:17 AM, Richard Biener wrote: > > On Fri, 8 Jul 2016, Prathamesh Kulkarni wrote: > > > > > Hi Richard, > > > For the following test-case: > > > > > > int f(int x, int y) > > > { > > > int ret; > > > > > > if (x == y) > > > ret = x ^ y; > > > else > > > ret = 1; > > > > > > return ret; > > > } > > > > > > I was wondering if x ^ y should be folded to 0 since > > > it's guarded by condition x == y ? > > > > > > optimized dump shows: > > > f (int x, int y) > > > { > > > int iftmp.0_1; > > > int iftmp.0_4; > > > > > > <bb 2>: > > > if (x_2(D) == y_3(D)) > > > goto <bb 3>; > > > else > > > goto <bb 4>; > > > > > > <bb 3>: > > > iftmp.0_4 = x_2(D) ^ y_3(D); > > > > > > <bb 4>: > > > # iftmp.0_1 = PHI <iftmp.0_4(3), 1(2)> > > > return iftmp.0_1; > > > > > > } > > > > > > The attached patch tries to fold for above case. > > > I am checking if op0 and op1 are equal using: > > > if (bitmap_intersect_p (vr1->equiv, vr2->equiv) > > > && operand_equal_p (vr1->min, vr1->max) > > > && operand_equal_p (vr2->min, vr2->max)) > > > { /* equal /* } > > > > > > I suppose intersection would check if op0 and op1 have equivalent ranges, > > > and added operand_equal_p check to ensure that there is only one > > > element within the range. Does that look correct ? > > > Bootstrap+test in progress on x86_64-unknown-linux-gnu. > > > > I think VRP is the wrong place to catch this and DOM should have but it > > does > > > > Optimizing block #3 > > > > 1>>> STMT 1 = x_2(D) le_expr y_3(D) > > 1>>> STMT 1 = x_2(D) ge_expr y_3(D) > > 1>>> STMT 1 = x_2(D) eq_expr y_3(D) > > 1>>> STMT 0 = x_2(D) ne_expr y_3(D) > > 0>>> COPY x_2(D) = y_3(D) > > 0>>> COPY y_3(D) = x_2(D) > > Optimizing statement ret_4 = x_2(D) ^ y_3(D); > > Replaced 'x_2(D)' with variable 'y_3(D)' > > Replaced 'y_3(D)' with variable 'x_2(D)' > > Folded to: ret_4 = x_2(D) ^ y_3(D); > > LKUP STMT ret_4 = x_2(D) bit_xor_expr y_3(D) > > > > heh, registering both reqivalencies is obviously not going to help... > > > > The 2nd equivalence is from doing > > > > /* We already recorded that LHS = RHS, with canonicalization, > > value chain following, etc. > > > > We also want to record RHS = LHS, but without any > > canonicalization > > or value chain following. */ > > if (TREE_CODE (rhs) == SSA_NAME) > > const_and_copies->record_const_or_copy_raw (rhs, lhs, > > SSA_NAME_VALUE (rhs)); > > > > generally recording both is not helpful. Jeff? This seems to be > > r233207 (fix for PR65917) which must have regressed this testcase. > The primary purpose of the const/copies table is to enable simple const/copy > propagation. So given a = b, we record that "a" can be replaced with "b" and > from then on we replace occurrences of "a" with "b" -- ie, copy propagation. > > We have a much lower priority goal of recording equivalences that arise from > conditionals such as if (a == b) and using those in const/copy propagation. > Given the structure of the table, we can record that a = b or b = a. The > problem is we've never come up with any good heuristics for which of those two > styles to record. Which you got was mostly a function of canonicalization > using SSA_NAME_VERSIONs. > > As seen in 65917, sometimes that decision is wrong and can lead to performance > regressions. But it's really just luck of the draw whether it gets optimized > or not. And the test > > 65917 allows us to record both equivalences in the table. We do that by > bypassing all the checks/canonicalizations with a raw insertion method. During > DOM we can now lookup either form which allows us to fix the regression in > 65917. > > It ought to be easy to fold x ^ y to zero when x == y (famous last words). > I'm sure I'll regret saying that when I go to look at how to twiddle DOM > appropriately.
Interesting idea. Though it get's (theoretically) interesting for ternary ops where you'd need to try 6 variants then ... thus in the end it's a hack ... That said, having SSA_NAME_VALUE (SSA_NAME_VALUE (t)) == t looks like a bug in the lattice to me. Richard.