Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-07-05 Thread Tom de Vries
On 03/05/12 12:21, Richard Guenther wrote:
 On Wed, May 2, 2012 at 4:06 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27/04/12 11:01, Richard Guenther wrote:
 SNIP
 I see you do not handle
 SNIP
 struct S { int i; };
 struct S foo (void);
 struct S bar (void)
 {
   struct S s1, s2;
   if (...)
s = foo ();
   else
s = foo ();

 because the calls have a LHS that is not an SSA name.

 Indeed, the gvn patch handles this example conservatively, and 
 tree-tail-merge
 fails to optimize this test-case:
 ...
 struct S { int i; };
 extern struct S foo (void);
 extern int foo2 (void);
 struct S s;
 int bar (int c) {
  int r;
  if (c)
{
  s = foo ();
  r = foo2 ();
}
  else
{
  s = foo ();
  r = foo2 ();
}
  return r;
 }
 ...

 A todo.

 SNIP
 bootstrapped and reg-tested on x86_64 (ada inclusive).

 Is this patch ok, or is the todo required?

 No, you can followup with that.


 Richard,

 here is the follow-up patch, which adds value numbering of a call for which 
 the
 lhs is not an SSA_NAME.

 The only thing I ended up using from the patch in
 http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html was the idea of using
 MODIFY_EXPR.

 I don't include any handling of MODIFY_EXPR in 
 create_component_ref_by_pieces_1
 because I don't think it will trigger with PRE.

 bootstrapped and reg-tested on x86_64.

 Ok for trunk?
 
 Hmm, I wonder why
 
   if (!gimple_call_internal_p (stmt)
(gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
   /* If the call has side effects, subsequent calls won't have
  the same incoming vuse, so it's save to assume
  equality.  */
   || gimple_has_side_effects (stmt)))
 
 works - I realize you added the gimple_has_side_effects () call - but
 if you consider ECF_LOOPING_CONST_OR_PURE functions, which
 have no VDEF, then it's odd how the comment applies.  And together
 both tests turn out to let all calls pass.
 

Richard,

You're right, this is not correct. The test for gimple_has_side_effect should be
a test for gimple_vdef.
A ECF_LOOPING_CONST_OR_PURE function will be rejected by the updated condition.

I fixed this in the patch, and added comments describing both the const/pure
clause, and the vdef clause.

I also removed the comment 'We should handle stores from calls' since this patch
implements that.

 +  tree lhs = gimple_call_lhs (call);
 +
 +  if (lhs  TREE_CODE (lhs) != SSA_NAME)
 +{
 +  memset (temp, 0, sizeof (temp));
 +  temp.opcode = MODIFY_EXPR;
 +  temp.type = TREE_TYPE (lhs);
 +  temp.op0 = lhs;
 +  temp.off = -1;
 +  VEC_safe_push (vn_reference_op_s, heap, *result, temp);
 +}
 
 this deserves a comment

Done.

 - you are adding the extra operand solely for
 the purpose of hashing.  You are also not doing a good job identifying
 common calls.  Consider
 
 if ()
  *p = foo ();
 else
  *q = foo ();
 
 where p and q are value-numbered the same.  You fail to properly
 commonize the blocks.  That is because valueization of the ops
 of the call does not work for arbitrarily complex operands - see
 how we handle call operands.  Instead you should probably use
 copy_reference_ops_from_ref on the lhs, similar to call operands.
 

If p and q are value numbered, it means they're SSA_NAMEs, and that means
they're not handled by this patch which is only about handling calls for which
the lhs is not an SSA_NAME.

This example is handled by the patch I posted for pr52009. I reposted the patch
and added this test-case 
(http://gcc.gnu.org/ml/gcc-patches/2012-07/msg00155.html).

So I'm not using copy_reference_ops_from_ref on the lhs, since it's not an 
SSA_NAME.

 Using MODIFY_EXPR as toplevel code for the vn_reference is going to
 indeed disable PRE for them, likewise any other call handling in VN.
 
 Otherwise the idea looks ok - can you change the patch like above
 and add a testcase with an equal-VNed indirect store?
 

I updated the patch as indicated in my comments, and added the test-case to the
patch for pr52009.

Bootstrapped and reg-tested on x86_64 (ada inclusive).

OK for trunk?

Thanks,
- Tom

2012-07-05  Tom de Vries  t...@codesourcery.com

* tree-ssa-sccvn.c (copy_reference_ops_from_call)
(visit_reference_op_call): Handle case that lhs is not an SSA_NAME.
(visit_use): Also call visit_reference_op_call for calls with a vdef.

* gcc.dg/pr51879-16.c: New test.
* gcc.dg/pr51879-17.c: Same.
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 189007)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -942,6 +942,20 @@ copy_reference_ops_from_call (gimple cal
 {
   vn_reference_op_s temp;
   unsigned i;
+  tree lhs = gimple_call_lhs (call);
+
+  /* If 2 calls have a different non-ssa lhs, vdef value numbers should be
+ different.  By adding the lhs here in the vector, we ensure that the
+

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-05-03 Thread Richard Guenther
On Wed, May 2, 2012 at 4:06 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27/04/12 11:01, Richard Guenther wrote:
 SNIP
 I see you do not handle
 SNIP
 struct S { int i; };
 struct S foo (void);
 struct S bar (void)
 {
   struct S s1, s2;
   if (...)
    s = foo ();
   else
    s = foo ();

 because the calls have a LHS that is not an SSA name.

 Indeed, the gvn patch handles this example conservatively, and 
 tree-tail-merge
 fails to optimize this test-case:
 ...
 struct S { int i; };
 extern struct S foo (void);
 extern int foo2 (void);
 struct S s;
 int bar (int c) {
  int r;
  if (c)
    {
      s = foo ();
      r = foo2 ();
    }
  else
    {
      s = foo ();
      r = foo2 ();
    }
  return r;
 }
 ...

 A todo.

 SNIP
 bootstrapped and reg-tested on x86_64 (ada inclusive).

 Is this patch ok, or is the todo required?

 No, you can followup with that.


 Richard,

 here is the follow-up patch, which adds value numbering of a call for which 
 the
 lhs is not an SSA_NAME.

 The only thing I ended up using from the patch in
 http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html was the idea of using
 MODIFY_EXPR.

 I don't include any handling of MODIFY_EXPR in 
 create_component_ref_by_pieces_1
 because I don't think it will trigger with PRE.

 bootstrapped and reg-tested on x86_64.

 Ok for trunk?

Hmm, I wonder why

  if (!gimple_call_internal_p (stmt)
   (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
  /* If the call has side effects, subsequent calls won't have
 the same incoming vuse, so it's save to assume
 equality.  */
  || gimple_has_side_effects (stmt)))

works - I realize you added the gimple_has_side_effects () call - but
if you consider ECF_LOOPING_CONST_OR_PURE functions, which
have no VDEF, then it's odd how the comment applies.  And together
both tests turn out to let all calls pass.

+  tree lhs = gimple_call_lhs (call);
+
+  if (lhs  TREE_CODE (lhs) != SSA_NAME)
+{
+  memset (temp, 0, sizeof (temp));
+  temp.opcode = MODIFY_EXPR;
+  temp.type = TREE_TYPE (lhs);
+  temp.op0 = lhs;
+  temp.off = -1;
+  VEC_safe_push (vn_reference_op_s, heap, *result, temp);
+}

this deserves a comment - you are adding the extra operand solely for
the purpose of hashing.  You are also not doing a good job identifying
common calls.  Consider

if ()
 *p = foo ();
else
 *q = foo ();

where p and q are value-numbered the same.  You fail to properly
commonize the blocks.  That is because valueization of the ops
of the call does not work for arbitrarily complex operands - see
how we handle call operands.  Instead you should probably use
copy_reference_ops_from_ref on the lhs, similar to call operands.

Using MODIFY_EXPR as toplevel code for the vn_reference is going to
indeed disable PRE for them, likewise any other call handling in VN.

Otherwise the idea looks ok - can you change the patch like above
and add a testcase with an equal-VNed indirect store?

Thanks,
Richard.

 Thanks,
 - Tom

 2012-05-02  Tom de Vries  t...@codesourcery.com

        * tree-ssa-sccvn.c (copy_reference_ops_from_call)
        (visit_reference_op_call): Handle case that lhs is not an SSA_NAME.
        (visit_use): Call visit_reference_op_call for calls with lhs that is 
 not
        an SSA_NAME.

        * gcc.dg/pr51879-16.c: New test.
        * gcc.dg/pr51879-17.c: Same.


Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-05-02 Thread Tom de Vries
On 27/04/12 11:01, Richard Guenther wrote:
SNIP
 I see you do not handle
SNIP
 struct S { int i; };
 struct S foo (void);
 struct S bar (void)
 {
   struct S s1, s2;
   if (...)
s = foo ();
   else
s = foo ();

 because the calls have a LHS that is not an SSA name.

 Indeed, the gvn patch handles this example conservatively, and 
 tree-tail-merge
 fails to optimize this test-case:
 ...
 struct S { int i; };
 extern struct S foo (void);
 extern int foo2 (void);
 struct S s;
 int bar (int c) {
  int r;
  if (c)
{
  s = foo ();
  r = foo2 ();
}
  else
{
  s = foo ();
  r = foo2 ();
}
  return r;
 }
 ...

 A todo.

SNIP
 bootstrapped and reg-tested on x86_64 (ada inclusive).

 Is this patch ok, or is the todo required?

 No, you can followup with that.


Richard,

here is the follow-up patch, which adds value numbering of a call for which the
lhs is not an SSA_NAME.

The only thing I ended up using from the patch in
http://gcc.gnu.org/ml/gcc-patches/2012-01/msg01731.html was the idea of using
MODIFY_EXPR.

I don't include any handling of MODIFY_EXPR in create_component_ref_by_pieces_1
because I don't think it will trigger with PRE.

bootstrapped and reg-tested on x86_64.

Ok for trunk?

Thanks,
- Tom

2012-05-02  Tom de Vries  t...@codesourcery.com

* tree-ssa-sccvn.c (copy_reference_ops_from_call)
(visit_reference_op_call): Handle case that lhs is not an SSA_NAME.
(visit_use): Call visit_reference_op_call for calls with lhs that is not
an SSA_NAME.

* gcc.dg/pr51879-16.c: New test.
* gcc.dg/pr51879-17.c: Same.
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 186907)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -942,6 +947,17 @@ copy_reference_ops_from_call (gimple cal
 {
   vn_reference_op_s temp;
   unsigned i;
+  tree lhs = gimple_call_lhs (call);
+
+  if (lhs  TREE_CODE (lhs) != SSA_NAME)
+{
+  memset (temp, 0, sizeof (temp));
+  temp.opcode = MODIFY_EXPR;
+  temp.type = TREE_TYPE (lhs);
+  temp.op0 = lhs;
+  temp.off = -1;
+  VEC_safe_push (vn_reference_op_s, heap, *result, temp);
+}
 
   /* Copy the type, opcode, function being called and static chain.  */
   memset (temp, 0, sizeof (temp));
@@ -2624,6 +2641,9 @@ visit_reference_op_call (tree lhs, gimpl
   tree vuse = gimple_vuse (stmt);
   tree vdef = gimple_vdef (stmt);
 
+  if (lhs  TREE_CODE (lhs) != SSA_NAME)
+lhs = NULL_TREE;
+
   vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE;
   vr1.operands = valueize_shared_reference_ops_from_call (stmt);
   vr1.type = gimple_expr_type (stmt);
@@ -3410,9 +3432,7 @@ visit_use (tree use)
 		  /* If the call has side effects, subsequent calls won't have
 		 the same incoming vuse, so it's save to assume
 		 equality.  */
-		  || gimple_has_side_effects (stmt))
-	   ((lhs  TREE_CODE (lhs) == SSA_NAME)
-		  || (!lhs  gimple_vdef (stmt
+		  || gimple_has_side_effects (stmt)))
 	{
 	  changed = visit_reference_op_call (lhs, stmt);
 	}
Index: gcc/testsuite/gcc.dg/pr51879-16.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-16.c (revision 0)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-pre } */
+
+struct S {
+  int i;
+};
+
+extern struct S foo (void);
+extern int foo2 (void);
+
+struct S s;
+
+int bar (int c) {
+  int r;
+
+  if (c)
+{
+  s = foo ();
+  r = foo2 ();
+}
+  else
+{
+  s = foo ();
+  r = foo2 ();
+}
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump-times foo \\( 1 pre} } */
+/* { dg-final { scan-tree-dump-times foo2 \\( 1 pre} } */
+/* { dg-final { cleanup-tree-dump pre } } */
Index: gcc/testsuite/gcc.dg/pr51879-17.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-17.c (revision 0)
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-pre } */
+
+struct S {
+  int i;
+};
+
+extern struct S foo (void);
+extern int foo2 (void);
+
+struct S s, s2;
+
+int bar (int c) {
+  int r;
+
+  if (c)
+{
+  s = foo ();
+  r = foo2 ();
+}
+  else
+{
+  s2 = foo ();
+  r = foo2 ();
+}
+
+  return r;
+}
+
+/* { dg-final { scan-tree-dump-times foo \\( 2 pre} } */
+/* { dg-final { scan-tree-dump-times foo2 \\( 2 pre} } */
+/* { dg-final { cleanup-tree-dump pre } } */


Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-27 Thread Tom de Vries
On 26/04/12 12:20, Richard Guenther wrote:
 On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 25/04/12 11:57, Richard Guenther wrote:
 SNIP
 Hmm.  I'm not sure we can conclude that they have the same value!

 +int bar (int);
 +void baz (int);
 +void bla (int);
 +
 +void
 +foo (int y)
 +{
 +  int a;
 +  if (y == 6)
 +{
 +  bla (5);
 +  a = bar (7);
 +}
 +  else
 +{
 +  bla (5);
 +  a = bar (7);
 +}
 +  baz (a);
 +}

 I can implement bla as

 void bla(int) { a = random (); }

 thus a would not have the same value on both paths

 I think it's hard to decide whether they have the same value or not, 
 since we
 cannot write a test-case that compares the results of calls from 2 
 branches of
 an if statement.
 void *foo ()
 {
   return __builtin_return_address (0);
 }

 void *bar (_Bool b)
 {
   if (b)
 return foo ();
   else
 return foo ();
 }

 int main()
 {
   if (bar(true) == bar(false))
 abort ();
 }

 ok ... outside of the scope of standard C, but we certainly _can_ do this.
 Which would question tail-merging the above at all, of course.


 I added noinline to make sure there's no variation from inlining.
 ...
 extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
 ((__noreturn__));

 __attribute__((noinline))
 void *foo ()
 {
  return __builtin_return_address (0);
 }

 __attribute__((noinline))
 void *bar (int b)
 {
  if (b)
return foo ();
  else
return foo ();
 }

 int main()
 {
  void *a, *b;
  a = bar (1);
  b = bar (0);
  if (a == b)
abort ();
  return 0;
 }
 ...

 This test-case passes with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fno-crossjumping
 ...

 and fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fcrossjumping
 ...

 with the proposed patch, it also fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge 
 -fno-crossjumping
 ...

 Tree-tail-merge performs the same transformation as cross-jumping for this
 test-case. I think that the transformation is valid, and that the test-case
 behaves as expected. Subsequent calls to foo may or may not return the same
 value, depending on the optimizations, we can't assert a != b, that's just 
 the
 nature of __builtin_return_address.

 Btw, this example is not what I meant when I said 'a test-case that compares 
 the
 results of calls from 2 branches of an if statement'. I tried to say that the
 semantics of C is defined in terms of taking either 1 branch or the other,
 rather than executing both in parallel, after which both results would be
 available. From that point of view, this example compares subsequent calls to
 foo, not parallel.
 
 Ah, I see.  Btw, I was not suggesting that we should not optimize the above.
 
 I started looking at the problem in terms of subsequent calls. Consider 
 the
 imaginary attribute nosideeffect, similar to const and pure.

 const:
 - no effects except the return value
 - the return value depends only on the parameters

 pure:
 - no effects except the return value
 - the return value depends only on the parameters and/or global variables

 nosideeffect:
 - no effects except the return value

 The code fragment:
 ...
 extern int f (int) __attribute ((nosideeffect));
 int g()
 {
  int a;
  int b;
  a = f (5);
  b = f (5);
  return a + b;
 }
 ...

 would result in a gimple sequence more or less like this:
 ...
  # VUSE .MEMD.1712_4(D)
  aD.1707_1 = fD.1704 (5);
  # VUSE .MEMD.1712_4(D)
  bD.1708_2 = fD.1704 (5);
  D.1710_3 = aD.1707_1 + bD.1708_2;
  # VUSE .MEMD.1712_6
  return D.1710_3;
 ...

 The value numbering modification I'm proposing would say that a and b 
 have the
 same value, which is not true. I updated the patch to ensure in this case 
 that a
 and b would be seen as different.
 Note that things won't break if the attribute is missing on a function, 
 since
 that just means that the function would have a vdef, and there would be 
 no 2
 subsequent function calls with the same vuse.

 - but that is not what
 matters - because obviously we can still merge the blocks.

 Right.

  That is,
 we cannot be sure that the side-effects on memory of bla (5) and bla (5)
 are the same.  But is that not what your value-numbering changes will 
 assert?

 In the case of calls in different branches of an control flow statement, 
 we can
 aggressively assume they're the same, since we cannot write a check that 
 would
 start failing.
 Apart from the above ... which shows that even the returned values can 
 matter.

 In the case of subsequent calls with side effects, the vuses will never 
 be the same.

 In the case of subsequent calls without side effects, but not pure or 
 const, we
 can't assume they have the same result. AFAIK, there's currently no way to
 specify such a function, but the updated patch should be able handle this.

 (not sure how you achieve this, but it looks like you insert/lookup 
 general
 calls, thus not 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-27 Thread Richard Guenther
On Fri, Apr 27, 2012 at 8:20 AM, Tom de Vries tom_devr...@mentor.com wrote:
 On 26/04/12 12:20, Richard Guenther wrote:
 On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 On 25/04/12 11:57, Richard Guenther wrote:
 SNIP
 Hmm.  I'm not sure we can conclude that they have the same value!

 +int bar (int);
 +void baz (int);
 +void bla (int);
 +
 +void
 +foo (int y)
 +{
 +  int a;
 +  if (y == 6)
 +    {
 +      bla (5);
 +      a = bar (7);
 +    }
 +  else
 +    {
 +      bla (5);
 +      a = bar (7);
 +    }
 +  baz (a);
 +}

 I can implement bla as

 void bla(int) { a = random (); }

 thus a would not have the same value on both paths

 I think it's hard to decide whether they have the same value or not, 
 since we
 cannot write a test-case that compares the results of calls from 2 
 branches of
 an if statement.
 void *foo ()
 {
   return __builtin_return_address (0);
 }

 void *bar (_Bool b)
 {
   if (b)
     return foo ();
   else
     return foo ();
 }

 int main()
 {
   if (bar(true) == bar(false))
     abort ();
 }

 ok ... outside of the scope of standard C, but we certainly _can_ do 
 this.
 Which would question tail-merging the above at all, of course.


 I added noinline to make sure there's no variation from inlining.
 ...
 extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
 ((__noreturn__));

 __attribute__((noinline))
 void *foo ()
 {
  return __builtin_return_address (0);
 }

 __attribute__((noinline))
 void *bar (int b)
 {
  if (b)
    return foo ();
  else
    return foo ();
 }

 int main()
 {
  void *a, *b;
  a = bar (1);
  b = bar (0);
  if (a == b)
    abort ();
  return 0;
 }
 ...

 This test-case passes with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fno-crossjumping
 ...

 and fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fcrossjumping
 ...

 with the proposed patch, it also fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge 
 -fno-crossjumping
 ...

 Tree-tail-merge performs the same transformation as cross-jumping for this
 test-case. I think that the transformation is valid, and that the test-case
 behaves as expected. Subsequent calls to foo may or may not return the same
 value, depending on the optimizations, we can't assert a != b, that's just 
 the
 nature of __builtin_return_address.

 Btw, this example is not what I meant when I said 'a test-case that 
 compares the
 results of calls from 2 branches of an if statement'. I tried to say that 
 the
 semantics of C is defined in terms of taking either 1 branch or the other,
 rather than executing both in parallel, after which both results would be
 available. From that point of view, this example compares subsequent calls 
 to
 foo, not parallel.

 Ah, I see.  Btw, I was not suggesting that we should not optimize the above.

 I started looking at the problem in terms of subsequent calls. Consider 
 the
 imaginary attribute nosideeffect, similar to const and pure.

 const:
 - no effects except the return value
 - the return value depends only on the parameters

 pure:
 - no effects except the return value
 - the return value depends only on the parameters and/or global variables

 nosideeffect:
 - no effects except the return value

 The code fragment:
 ...
 extern int f (int) __attribute ((nosideeffect));
 int g()
 {
  int a;
  int b;
  a = f (5);
  b = f (5);
  return a + b;
 }
 ...

 would result in a gimple sequence more or less like this:
 ...
  # VUSE .MEMD.1712_4(D)
  aD.1707_1 = fD.1704 (5);
  # VUSE .MEMD.1712_4(D)
  bD.1708_2 = fD.1704 (5);
  D.1710_3 = aD.1707_1 + bD.1708_2;
  # VUSE .MEMD.1712_6
  return D.1710_3;
 ...

 The value numbering modification I'm proposing would say that a and b 
 have the
 same value, which is not true. I updated the patch to ensure in this 
 case that a
 and b would be seen as different.
 Note that things won't break if the attribute is missing on a function, 
 since
 that just means that the function would have a vdef, and there would be 
 no 2
 subsequent function calls with the same vuse.

 - but that is not what
 matters - because obviously we can still merge the blocks.

 Right.

  That is,
 we cannot be sure that the side-effects on memory of bla (5) and bla 
 (5)
 are the same.  But is that not what your value-numbering changes will 
 assert?

 In the case of calls in different branches of an control flow statement, 
 we can
 aggressively assume they're the same, since we cannot write a check that 
 would
 start failing.
 Apart from the above ... which shows that even the returned values can 
 matter.

 In the case of subsequent calls with side effects, the vuses will never 
 be the same.

 In the case of subsequent calls without side effects, but not pure or 
 const, we
 can't assume they have the same result. AFAIK, there's currently no way 
 to
 specify such a function, but the updated patch should be able handle 
 this.

 (not 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-26 Thread Richard Guenther
On Wed, Apr 25, 2012 at 11:56 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 25/04/12 11:57, Richard Guenther wrote:
 SNIP
  Hmm.  I'm not sure we can conclude that they have the same value!
 
  +int bar (int);
  +void baz (int);
  +void bla (int);
  +
  +void
  +foo (int y)
  +{
  +  int a;
  +  if (y == 6)
  +    {
  +      bla (5);
  +      a = bar (7);
  +    }
  +  else
  +    {
  +      bla (5);
  +      a = bar (7);
  +    }
  +  baz (a);
  +}
 
  I can implement bla as
 
  void bla(int) { a = random (); }
 
  thus a would not have the same value on both paths
 
  I think it's hard to decide whether they have the same value or not, 
  since we
  cannot write a test-case that compares the results of calls from 2 
  branches of
  an if statement.
 void *foo ()
 {
   return __builtin_return_address (0);
 }

 void *bar (_Bool b)
 {
   if (b)
     return foo ();
   else
     return foo ();
 }

 int main()
 {
   if (bar(true) == bar(false))
     abort ();
 }

 ok ... outside of the scope of standard C, but we certainly _can_ do this.
 Which would question tail-merging the above at all, of course.


 I added noinline to make sure there's no variation from inlining.
 ...
 extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
 ((__noreturn__));

 __attribute__((noinline))
 void *foo ()
 {
  return __builtin_return_address (0);
 }

 __attribute__((noinline))
 void *bar (int b)
 {
  if (b)
    return foo ();
  else
    return foo ();
 }

 int main()
 {
  void *a, *b;
  a = bar (1);
  b = bar (0);
  if (a == b)
    abort ();
  return 0;
 }
 ...

 This test-case passes with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fno-crossjumping
 ...

 and fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
 -fcrossjumping
 ...

 with the proposed patch, it also fails with:
 ...
 gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge 
 -fno-crossjumping
 ...

 Tree-tail-merge performs the same transformation as cross-jumping for this
 test-case. I think that the transformation is valid, and that the test-case
 behaves as expected. Subsequent calls to foo may or may not return the same
 value, depending on the optimizations, we can't assert a != b, that's just the
 nature of __builtin_return_address.

 Btw, this example is not what I meant when I said 'a test-case that compares 
 the
 results of calls from 2 branches of an if statement'. I tried to say that the
 semantics of C is defined in terms of taking either 1 branch or the other,
 rather than executing both in parallel, after which both results would be
 available. From that point of view, this example compares subsequent calls to
 foo, not parallel.

Ah, I see.  Btw, I was not suggesting that we should not optimize the above.

  I started looking at the problem in terms of subsequent calls. Consider 
  the
  imaginary attribute nosideeffect, similar to const and pure.
 
  const:
  - no effects except the return value
  - the return value depends only on the parameters
 
  pure:
  - no effects except the return value
  - the return value depends only on the parameters and/or global variables
 
  nosideeffect:
  - no effects except the return value
 
  The code fragment:
  ...
  extern int f (int) __attribute ((nosideeffect));
  int g()
  {
   int a;
   int b;
   a = f (5);
   b = f (5);
   return a + b;
  }
  ...
 
  would result in a gimple sequence more or less like this:
  ...
   # VUSE .MEMD.1712_4(D)
   aD.1707_1 = fD.1704 (5);
   # VUSE .MEMD.1712_4(D)
   bD.1708_2 = fD.1704 (5);
   D.1710_3 = aD.1707_1 + bD.1708_2;
   # VUSE .MEMD.1712_6
   return D.1710_3;
  ...
 
  The value numbering modification I'm proposing would say that a and b 
  have the
  same value, which is not true. I updated the patch to ensure in this case 
  that a
  and b would be seen as different.
  Note that things won't break if the attribute is missing on a function, 
  since
  that just means that the function would have a vdef, and there would be 
  no 2
  subsequent function calls with the same vuse.
 
  - but that is not what
  matters - because obviously we can still merge the blocks.
 
  Right.
 
   That is,
  we cannot be sure that the side-effects on memory of bla (5) and bla (5)
  are the same.  But is that not what your value-numbering changes will 
  assert?
 
  In the case of calls in different branches of an control flow statement, 
  we can
  aggressively assume they're the same, since we cannot write a check that 
  would
  start failing.
 Apart from the above ... which shows that even the returned values can 
 matter.

  In the case of subsequent calls with side effects, the vuses will never 
  be the same.
 
  In the case of subsequent calls without side effects, but not pure or 
  const, we
  can't assume they have the same result. AFAIK, there's currently no way to
  specify such a function, but the updated patch should be able handle this.
 
  (not sure how you achieve this, but it 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-25 Thread Richard Guenther
On Tue, Apr 24, 2012 at 11:19 PM, Tom de Vries tom_devr...@mentor.com wrote:
 On 17/04/12 14:24, Richard Guenther wrote:
 On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27/01/12 21:37, Tom de Vries wrote:
 On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
    a = bar (7);
  else
    a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

 The following cannot really work:

 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
    result = vn_reference_lookup_1 (vr1, NULL);
    if (result)
      {
 -      changed = set_ssa_val_to (lhs, result);
 +      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +      if (vdef)
 +       changed |= set_ssa_val_to (vdef, result_vdef);
 +      changed |= set_ssa_val_to (lhs, result);

 because 'result' may be not an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)-needs_insertion
 is true).  So you at least need to guard things properly.


 Right. And that also doesn't work if the function is without lhs, such as 
 in the
 new test-case pr51879-6.c.

 I fixed this by storing both lhs and vdef, such that I don't have to derive
 the vdef from the lhs.

 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)


 Doing so willl hurt performance of tail-merging in its current form.
 OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
 value numbering as used in tail-merging has its limitations too.
 Do you have any ideas how to address that one?

 @@ -3359,8 +3366,10 @@ visit_use (tree use)
           /* ???  We should handle stores from calls.  */
           else if (TREE_CODE (lhs) == SSA_NAME)
             {
 +             tree vuse = gimple_vuse (stmt);
               if (!gimple_call_internal_p (stmt)
 -                  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
 +                  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
 +                     || (vuse  SSA_VAL (vuse) != VN_TOP)))
                 changed = visit_reference_op_call (lhs, stmt);
               else
                 changed = defs_to_varying (stmt);

 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?


 Removed now, that was a workaround for a bug in an earlier version of the 
 patch,
 that I didn't need anymore.

 Bootstrapped and reg-tested on x86_64.

 OK for stage1?


 Richard,

 quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
 ...
 I think these fixes hint at that we should
 use structural equality as fallback if value-numbering doesn't equate
 two stmt effects.  Thus, treat two stmts with exactly the same operands
 and flags as equal and using value-numbering to canonicalize operands
 (when they are SSA names) for that comparison, or use VN entirely
 if there are no side-effects on the stmt.

 Changing value-numbering of virtual operands, even if it looks correct in 
 the
 simple cases you change, doesn't look like a general solution for the missed
 tail merging opportunities.
 ...

 The test-case pr51879-6.c shows a case where improving value numbering will 
 help
 tail-merging, but structural equality comparison not:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF .MEMD.1717_7
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-25 Thread Jakub Jelinek
On Wed, Apr 25, 2012 at 11:57:09AM +0200, Richard Guenther wrote:
 void *foo ()
 {
   return __builtin_return_address (0);
 }
 
 void *bar (_Bool b)
 {
   if (b)
 return foo ();
   else
 return foo ();
 }
 
 int main()
 {
   if (bar(true) == bar(false))
 abort ();
 }
 
 ok ... outside of the scope of standard C, but we certainly _can_ do this.
 Which would question tail-merging the above at all, of course.

I don't think we guarantee the above, after all, even pure functions may
use __builtin_return_address (0) - it doesn't modify memory, and we happily
remove pure calls, CSE the return values etc.

Jakub


Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-25 Thread Tom de Vries
On 25/04/12 12:09, Jakub Jelinek wrote:
 On Wed, Apr 25, 2012 at 11:57:09AM +0200, Richard Guenther wrote:
 void *foo ()
 {
   return __builtin_return_address (0);
 }

 void *bar (_Bool b)
 {
   if (b)
 return foo ();
   else
 return foo ();
 }

 int main()
 {
   if (bar(true) == bar(false))
 abort ();
 }

 ok ... outside of the scope of standard C, but we certainly _can_ do this.
 Which would question tail-merging the above at all, of course.
 
 I don't think we guarantee the above, after all, even pure functions may
 use __builtin_return_address (0) - it doesn't modify memory, and we happily
 remove pure calls, CSE the return values etc.
 

Jakub,

pure:
- no effects except the return value
- return value depends only on the parameters and/or global variables

AFAIU, given this definition, a pure function cannot use
__builtin_return_address since it would mean that the return value depends on
something else than parameters and global variables.

Thanks,
- Tom

   Jakub



Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-25 Thread Tom de Vries
On 25/04/12 11:57, Richard Guenther wrote:
SNIP
  Hmm.  I'm not sure we can conclude that they have the same value!
 
  +int bar (int);
  +void baz (int);
  +void bla (int);
  +
  +void
  +foo (int y)
  +{
  +  int a;
  +  if (y == 6)
  +{
  +  bla (5);
  +  a = bar (7);
  +}
  +  else
  +{
  +  bla (5);
  +  a = bar (7);
  +}
  +  baz (a);
  +}
 
  I can implement bla as
 
  void bla(int) { a = random (); }
 
  thus a would not have the same value on both paths
 
  I think it's hard to decide whether they have the same value or not, since 
  we
  cannot write a test-case that compares the results of calls from 2 
  branches of
  an if statement.
 void *foo ()
 {
   return __builtin_return_address (0);
 }
 
 void *bar (_Bool b)
 {
   if (b)
 return foo ();
   else
 return foo ();
 }
 
 int main()
 {
   if (bar(true) == bar(false))
 abort ();
 }
 
 ok ... outside of the scope of standard C, but we certainly _can_ do this.
 Which would question tail-merging the above at all, of course.
 

I added noinline to make sure there's no variation from inlining.
...
extern void abort (void) __attribute__ ((__nothrow__)) __attribute__
((__noreturn__));

__attribute__((noinline))
void *foo ()
{
  return __builtin_return_address (0);
}

__attribute__((noinline))
void *bar (int b)
{
  if (b)
return foo ();
  else
return foo ();
}

int main()
{
  void *a, *b;
  a = bar (1);
  b = bar (0);
  if (a == b)
abort ();
  return 0;
}
...

This test-case passes with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge 
-fno-crossjumping
...

and fails with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -fno-tree-tail-merge -fcrossjumping
...

with the proposed patch, it also fails with:
...
gcc -O2 calls.c -fno-optimize-sibling-calls -ftree-tail-merge -fno-crossjumping
...

Tree-tail-merge performs the same transformation as cross-jumping for this
test-case. I think that the transformation is valid, and that the test-case
behaves as expected. Subsequent calls to foo may or may not return the same
value, depending on the optimizations, we can't assert a != b, that's just the
nature of __builtin_return_address.

Btw, this example is not what I meant when I said 'a test-case that compares the
results of calls from 2 branches of an if statement'. I tried to say that the
semantics of C is defined in terms of taking either 1 branch or the other,
rather than executing both in parallel, after which both results would be
available. From that point of view, this example compares subsequent calls to
foo, not parallel.

  I started looking at the problem in terms of subsequent calls. Consider the
  imaginary attribute nosideeffect, similar to const and pure.
 
  const:
  - no effects except the return value
  - the return value depends only on the parameters
 
  pure:
  - no effects except the return value
  - the return value depends only on the parameters and/or global variables
 
  nosideeffect:
  - no effects except the return value
 
  The code fragment:
  ...
  extern int f (int) __attribute ((nosideeffect));
  int g()
  {
   int a;
   int b;
   a = f (5);
   b = f (5);
   return a + b;
  }
  ...
 
  would result in a gimple sequence more or less like this:
  ...
   # VUSE .MEMD.1712_4(D)
   aD.1707_1 = fD.1704 (5);
   # VUSE .MEMD.1712_4(D)
   bD.1708_2 = fD.1704 (5);
   D.1710_3 = aD.1707_1 + bD.1708_2;
   # VUSE .MEMD.1712_6
   return D.1710_3;
  ...
 
  The value numbering modification I'm proposing would say that a and b have 
  the
  same value, which is not true. I updated the patch to ensure in this case 
  that a
  and b would be seen as different.
  Note that things won't break if the attribute is missing on a function, 
  since
  that just means that the function would have a vdef, and there would be no 
  2
  subsequent function calls with the same vuse.
 
  - but that is not what
  matters - because obviously we can still merge the blocks.
 
  Right.
 
   That is,
  we cannot be sure that the side-effects on memory of bla (5) and bla (5)
  are the same.  But is that not what your value-numbering changes will 
  assert?
 
  In the case of calls in different branches of an control flow statement, 
  we can
  aggressively assume they're the same, since we cannot write a check that 
  would
  start failing.
 Apart from the above ... which shows that even the returned values can matter.
 
  In the case of subsequent calls with side effects, the vuses will never be 
  the same.
 
  In the case of subsequent calls without side effects, but not pure or 
  const, we
  can't assume they have the same result. AFAIK, there's currently no way to
  specify such a function, but the updated patch should be able handle this.
 
  (not sure how you achieve this, but it looks like you insert/lookup 
  general
  calls, thus not pure/const ones
 
  Correct.
 
  - that could easily lead to miscompiles(?)).
 
 And to increased compile-time (not sure if it matters).
 
  I 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-24 Thread Tom de Vries
On 17/04/12 14:24, Richard Guenther wrote:
 On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27/01/12 21:37, Tom de Vries wrote:
 On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
a = bar (7);
  else
a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

 The following cannot really work:

 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
result = vn_reference_lookup_1 (vr1, NULL);
if (result)
  {
 -  changed = set_ssa_val_to (lhs, result);
 +  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +  if (vdef)
 +   changed |= set_ssa_val_to (vdef, result_vdef);
 +  changed |= set_ssa_val_to (lhs, result);

 because 'result' may be not an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)-needs_insertion
 is true).  So you at least need to guard things properly.


 Right. And that also doesn't work if the function is without lhs, such as 
 in the
 new test-case pr51879-6.c.

 I fixed this by storing both lhs and vdef, such that I don't have to derive
 the vdef from the lhs.

 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)


 Doing so willl hurt performance of tail-merging in its current form.
 OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
 value numbering as used in tail-merging has its limitations too.
 Do you have any ideas how to address that one?

 @@ -3359,8 +3366,10 @@ visit_use (tree use)
   /* ???  We should handle stores from calls.  */
   else if (TREE_CODE (lhs) == SSA_NAME)
 {
 + tree vuse = gimple_vuse (stmt);
   if (!gimple_call_internal_p (stmt)
 -  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
 +  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
 + || (vuse  SSA_VAL (vuse) != VN_TOP)))
 changed = visit_reference_op_call (lhs, stmt);
   else
 changed = defs_to_varying (stmt);

 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?


 Removed now, that was a workaround for a bug in an earlier version of the 
 patch,
 that I didn't need anymore.

 Bootstrapped and reg-tested on x86_64.

 OK for stage1?


 Richard,

 quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
 ...
 I think these fixes hint at that we should
 use structural equality as fallback if value-numbering doesn't equate
 two stmt effects.  Thus, treat two stmts with exactly the same operands
 and flags as equal and using value-numbering to canonicalize operands
 (when they are SSA names) for that comparison, or use VN entirely
 if there are no side-effects on the stmt.

 Changing value-numbering of virtual operands, even if it looks correct in the
 simple cases you change, doesn't look like a general solution for the missed
 tail merging opportunities.
 ...

 The test-case pr51879-6.c shows a case where improving value numbering will 
 help
 tail-merging, but structural equality comparison not:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF .MEMD.1717_7
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1717_9 = VDEF .MEMD.1717_6(D)
  # USE = 

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-17 Thread Richard Guenther
On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries tom_devr...@mentor.com wrote:
 On 27/01/12 21:37, Tom de Vries wrote:
 On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
    a = bar (7);
  else
    a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

 The following cannot really work:

 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
    result = vn_reference_lookup_1 (vr1, NULL);
    if (result)
      {
 -      changed = set_ssa_val_to (lhs, result);
 +      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +      if (vdef)
 +       changed |= set_ssa_val_to (vdef, result_vdef);
 +      changed |= set_ssa_val_to (lhs, result);

 because 'result' may be not an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)-needs_insertion
 is true).  So you at least need to guard things properly.


 Right. And that also doesn't work if the function is without lhs, such as in 
 the
 new test-case pr51879-6.c.

 I fixed this by storing both lhs and vdef, such that I don't have to derive
 the vdef from the lhs.

 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)


 Doing so willl hurt performance of tail-merging in its current form.
 OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
 value numbering as used in tail-merging has its limitations too.
 Do you have any ideas how to address that one?

 @@ -3359,8 +3366,10 @@ visit_use (tree use)
           /* ???  We should handle stores from calls.  */
           else if (TREE_CODE (lhs) == SSA_NAME)
             {
 +             tree vuse = gimple_vuse (stmt);
               if (!gimple_call_internal_p (stmt)
 -                  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
 +                  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
 +                     || (vuse  SSA_VAL (vuse) != VN_TOP)))
                 changed = visit_reference_op_call (lhs, stmt);
               else
                 changed = defs_to_varying (stmt);

 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?


 Removed now, that was a workaround for a bug in an earlier version of the 
 patch,
 that I didn't need anymore.

 Bootstrapped and reg-tested on x86_64.

 OK for stage1?


 Richard,

 quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
 ...
 I think these fixes hint at that we should
 use structural equality as fallback if value-numbering doesn't equate
 two stmt effects.  Thus, treat two stmts with exactly the same operands
 and flags as equal and using value-numbering to canonicalize operands
 (when they are SSA names) for that comparison, or use VN entirely
 if there are no side-effects on the stmt.

 Changing value-numbering of virtual operands, even if it looks correct in the
 simple cases you change, doesn't look like a general solution for the missed
 tail merging opportunities.
 ...

 The test-case pr51879-6.c shows a case where improving value numbering will 
 help
 tail-merging, but structural equality comparison not:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF .MEMD.1717_7
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1717_9 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-04-14 Thread Tom de Vries
On 27/01/12 21:37, Tom de Vries wrote:
 On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com 
 wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
a = bar (7);
  else
a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

 The following cannot really work:

 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
result = vn_reference_lookup_1 (vr1, NULL);
if (result)
  {
 -  changed = set_ssa_val_to (lhs, result);
 +  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +  if (vdef)
 +   changed |= set_ssa_val_to (vdef, result_vdef);
 +  changed |= set_ssa_val_to (lhs, result);

 because 'result' may be not an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)-needs_insertion
 is true).  So you at least need to guard things properly.

 
 Right. And that also doesn't work if the function is without lhs, such as in 
 the
 new test-case pr51879-6.c.
 
 I fixed this by storing both lhs and vdef, such that I don't have to derive
 the vdef from the lhs.
 
 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)

 
 Doing so willl hurt performance of tail-merging in its current form.
 OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
 value numbering as used in tail-merging has its limitations too.
 Do you have any ideas how to address that one?
 
 @@ -3359,8 +3366,10 @@ visit_use (tree use)
   /* ???  We should handle stores from calls.  */
   else if (TREE_CODE (lhs) == SSA_NAME)
 {
 + tree vuse = gimple_vuse (stmt);
   if (!gimple_call_internal_p (stmt)
 -  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
 +  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
 + || (vuse  SSA_VAL (vuse) != VN_TOP)))
 changed = visit_reference_op_call (lhs, stmt);
   else
 changed = defs_to_varying (stmt);

 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?

 
 Removed now, that was a workaround for a bug in an earlier version of the 
 patch,
 that I didn't need anymore.
 
 Bootstrapped and reg-tested on x86_64.
 
 OK for stage1?
 

Richard,

quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
...
I think these fixes hint at that we should
use structural equality as fallback if value-numbering doesn't equate
two stmt effects.  Thus, treat two stmts with exactly the same operands
and flags as equal and using value-numbering to canonicalize operands
(when they are SSA names) for that comparison, or use VN entirely
if there are no side-effects on the stmt.

Changing value-numbering of virtual operands, even if it looks correct in the
simple cases you change, doesn't look like a general solution for the missed
tail merging opportunities.
...

The test-case pr51879-6.c shows a case where improving value numbering will help
tail-merging, but structural equality comparison not:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1717_7 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_8 = VDEF .MEMD.1717_7
  # USE = nonlocal
  # CLB = nonlocal
  aD.1712_3 = barD.1704 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1717_9 = VDEF .MEMD.1717_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  blaD.1708 (5);
  # .MEMD.1717_10 = VDEF .MEMD.1717_9
  # USE = nonlocal
  # CLB = nonlocal
  

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-01-27 Thread Tom de Vries
On 24/01/12 11:40, Richard Guenther wrote:
 On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
a = bar (7);
  else
a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
 .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?
 
 The following cannot really work:
 
 @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
result = vn_reference_lookup_1 (vr1, NULL);
if (result)
  {
 -  changed = set_ssa_val_to (lhs, result);
 +  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
 +  if (vdef)
 +   changed |= set_ssa_val_to (vdef, result_vdef);
 +  changed |= set_ssa_val_to (lhs, result);
 
 because 'result' may be not an SSA name.  It might also not have
 a proper definition statement (if VN_INFO (result)-needs_insertion
 is true).  So you at least need to guard things properly.
 

Right. And that also doesn't work if the function is without lhs, such as in the
new test-case pr51879-6.c.

I fixed this by storing both lhs and vdef, such that I don't have to derive
the vdef from the lhs.

 (On a side-note - I _did_ want to remove value-numbering virtual operands
 at some point ...)
 

Doing so willl hurt performance of tail-merging in its current form.
OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
value numbering as used in tail-merging has its limitations too.
Do you have any ideas how to address that one?

 @@ -3359,8 +3366,10 @@ visit_use (tree use)
   /* ???  We should handle stores from calls.  */
   else if (TREE_CODE (lhs) == SSA_NAME)
 {
 + tree vuse = gimple_vuse (stmt);
   if (!gimple_call_internal_p (stmt)
 -  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
 +  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
 + || (vuse  SSA_VAL (vuse) != VN_TOP)))
 changed = visit_reference_op_call (lhs, stmt);
   else
 changed = defs_to_varying (stmt);
 
 ... exactly because of the issue that a stmt has multiple defs.  Btw,
 vuse should have been visited here or be part of our SCC, so, why do
 you need this check?
 

Removed now, that was a workaround for a bug in an earlier version of the patch,
that I didn't need anymore.

Bootstrapped and reg-tested on x86_64.

OK for stage1?

Thanks,
- Tom

 Thanks,
 Richard.
 

2012-01-27  Tom de Vries  t...@codesourcery.com

PR tree-optimization/51879
* tree-ssa-sccvn.h (struct vn_reference_s): Add vdef field.
* tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
Handle case that lhs is NULL_TREE.
(visit_use): Handle non-pure/const calls and calls without result using
visit_reference_op_call.

gcc.dg/pr51879.c: New test.
gcc.dg/pr51879-2.c: Same.
gcc.dg/pr51879-3.c: Same.
gcc.dg/pr51879-4.c: Same.
gcc.dg/pr51879-6.c: Same.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 183325)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -2589,27 +2589,44 @@ visit_reference_op_call (tree lhs, gimpl
 {
   bool changed = false;
   struct vn_reference_s vr1;
-  tree result;
+  vn_reference_t vnresult = NULL;
   tree vuse = gimple_vuse (stmt);
+  tree vdef = gimple_vdef (stmt);
+
+  if (vdef)
+VN_INFO (vdef)-use_processed = true;
 
   vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE;
   vr1.operands = valueize_shared_reference_ops_from_call (stmt);
   vr1.type = gimple_expr_type (stmt);
   vr1.set = 0;
   vr1.hashcode = vn_reference_compute_hash (vr1);
-  result = vn_reference_lookup_1 (vr1, NULL);
-  if (result)
+  

Re: [PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-01-24 Thread Richard Guenther
On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries tom_devr...@mentor.com wrote:
 Richard,
 Jakub,

 the following patch fixes PR51879.

 Consider the following test-case:
 ...
 int bar (int);
 void baz (int);

 void
 foo (int y)
 {
  int a;
  if (y == 6)
    a = bar (7);
  else
    a = bar (7);
  baz (a);
 }
 ...

 after compiling at -02, the representation looks like this before 
 tail-merging:
 ...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
 ...

 the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
 to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

 The patch makes sure non-const/pure call results (gimple_vdef and
 gimple_call_lhs) are properly value numbered.

 Bootstrapped and reg-tested on x86_64.

 ok for stage1?

The following cannot really work:

@@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
   result = vn_reference_lookup_1 (vr1, NULL);
   if (result)
 {
-  changed = set_ssa_val_to (lhs, result);
+  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
+  if (vdef)
+   changed |= set_ssa_val_to (vdef, result_vdef);
+  changed |= set_ssa_val_to (lhs, result);

becase 'result' may be not an SSA name.  It might also not have
a proper definition statement (if VN_INFO (result)-needs_insertion
is true).  So you at least need to guard things properly.

(On a side-note - I _did_ want to remove value-numbering virtual operands
at some point ...)

@@ -3359,8 +3366,10 @@ visit_use (tree use)
  /* ???  We should handle stores from calls.  */
  else if (TREE_CODE (lhs) == SSA_NAME)
{
+ tree vuse = gimple_vuse (stmt);
  if (!gimple_call_internal_p (stmt)
-  gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
+  (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
+ || (vuse  SSA_VAL (vuse) != VN_TOP)))
changed = visit_reference_op_call (lhs, stmt);
  else
changed = defs_to_varying (stmt);

... exactly because of the issue that a stmt has multiple defs.  Btw,
vuse should have been visited here or be part of our SCC, so, why do
you need this check?

Thanks,
Richard.

 Thanks,
 - Tom

 2012-01-23  Tom de Vries  t...@codesourcery.com

        PR tree-optimization/51879
        tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
        (visit_use): Handle non-pure/const calls using visit_reference_op_call.

        gcc.dg/pr51879.c: New test.
        gcc.dg/pr51879-2.c: Same.
        gcc.dg/pr51879-3.c: Same.
        gcc.dg/pr51879-4.c: Same.


[PATCH] Fix for PR51879 - Missed tail merging with non-const/pure calls

2012-01-23 Thread Tom de Vries
Richard,
Jakub,

the following patch fixes PR51879.

Consider the following test-case:
...
int bar (int);
void baz (int);

void
foo (int y)
{
  int a;
  if (y == 6)
a = bar (7);
  else
a = bar (7);
  baz (a);
}
...

after compiling at -02, the representation looks like this before tail-merging:
...
  # BLOCK 3 freq:1991
  # PRED: 2 [19.9%]  (true,exec)
  # .MEMD.1714_7 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_3 = barD.1703 (7);
  goto bb 5;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:8009
  # PRED: 2 [80.1%]  (false,exec)
  # .MEMD.1714_8 = VDEF .MEMD.1714_6(D)
  # USE = nonlocal
  # CLB = nonlocal
  aD.1709_4 = barD.1703 (7);
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # aD.1709_1 = PHI aD.1709_3(3), aD.1709_4(4)
  # .MEMD.1714_5 = PHI .MEMD.1714_7(3), .MEMD.1714_8(4)
  # .MEMD.1714_9 = VDEF .MEMD.1714_5
  # USE = nonlocal
  # CLB = nonlocal
  bazD.1705 (aD.1709_1);
  # VUSE .MEMD.1714_9
  return;
...

the patch allows aD.1709_4 to be value numbered to aD.1709_3, and .MEMD.1714_8
to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.

The patch makes sure non-const/pure call results (gimple_vdef and
gimple_call_lhs) are properly value numbered.

Bootstrapped and reg-tested on x86_64.

ok for stage1?

Thanks,
- Tom

2012-01-23  Tom de Vries  t...@codesourcery.com

PR tree-optimization/51879
tree-ssa-sccvn.c (visit_reference_op_call): Handle gimple_vdef.
(visit_use): Handle non-pure/const calls using visit_reference_op_call.

gcc.dg/pr51879.c: New test.
gcc.dg/pr51879-2.c: Same.
gcc.dg/pr51879-3.c: Same.
gcc.dg/pr51879-4.c: Same.
Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c (revision 183325)
+++ gcc/tree-ssa-sccvn.c (working copy)
@@ -2591,6 +2591,7 @@ visit_reference_op_call (tree lhs, gimpl
   struct vn_reference_s vr1;
   tree result;
   tree vuse = gimple_vuse (stmt);
+  tree vdef = gimple_vdef (stmt);
 
   vr1.vuse = vuse ? SSA_VAL (vuse) : NULL_TREE;
   vr1.operands = valueize_shared_reference_ops_from_call (stmt);
@@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
   result = vn_reference_lookup_1 (vr1, NULL);
   if (result)
 {
-  changed = set_ssa_val_to (lhs, result);
+  tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
+  if (vdef)
+	changed |= set_ssa_val_to (vdef, result_vdef);
+  changed |= set_ssa_val_to (lhs, result);
+
   if (TREE_CODE (result) == SSA_NAME
 	   VN_INFO (result)-has_constants)
 	VN_INFO (lhs)-has_constants = true;
@@ -2609,7 +2614,9 @@ visit_reference_op_call (tree lhs, gimpl
 {
   void **slot;
   vn_reference_t vr2;
-  changed = set_ssa_val_to (lhs, lhs);
+  if (vdef)
+	changed |= set_ssa_val_to (vdef, vdef);
+  changed |= set_ssa_val_to (lhs, lhs);
   vr2 = (vn_reference_t) pool_alloc (current_info-references_pool);
   vr2-vuse = vr1.vuse;
   vr2-operands = valueize_refs (create_reference_ops_from_call (stmt));
@@ -3359,8 +3366,10 @@ visit_use (tree use)
 	  /* ???  We should handle stores from calls.  */
 	  else if (TREE_CODE (lhs) == SSA_NAME)
 	{
+	  tree vuse = gimple_vuse (stmt);
 	  if (!gimple_call_internal_p (stmt)
-		   gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST))
+		   (gimple_call_flags (stmt)  (ECF_PURE | ECF_CONST)
+		  || (vuse  SSA_VAL (vuse) != VN_TOP)))
 		changed = visit_reference_op_call (lhs, stmt);
 	  else
 		changed = defs_to_varying (stmt);
Index: gcc/testsuite/gcc.dg/pr51879-2.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-2.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-pre } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y)
+baz (bar (7) + 6);
+  else
+baz (bar (7) + 6);
+}
+
+/* { dg-final { scan-tree-dump-times bar \\( 1 pre} } */
+/* { dg-final { scan-tree-dump-times baz \\( 1 pre} } */
+/* { dg-final { cleanup-tree-dump pre } } */
Index: gcc/testsuite/gcc.dg/pr51879.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-pre } */
+
+int bar (int);
+void baz (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+a = bar (7);
+  else
+a = bar (7);
+  baz (a);
+}
+
+/* { dg-final { scan-tree-dump-times bar \\( 1 pre} } */
+/* { dg-final { cleanup-tree-dump pre } } */
Index: gcc/testsuite/gcc.dg/pr51879-3.c
===
--- /dev/null (new file)
+++ gcc/testsuite/gcc.dg/pr51879-3.c (revision 0)
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fdump-tree-pre } */
+
+int