The more aggressive threading across loop backedges requires invalidating equivalences that do not hold across all iterations of a loop.

At first glance, invaliding at PHI nodes should be sufficient as any statement which potentially generated a new equivalence would be reprocessed as we come across the backedge. However, there is one important case where that does not hold.

Specifically we might have derived a value from a conditional and the conditional might have been fed by a statement that doesn't produce useful equivalences (such as a GIMPLE_ASM). Thus the equivalence from the conditional is still visible because no new equivalence will be recorded for the GIMPLE_ASM.

So if the result of the GIMPLE_ASM that gets used in the conditional varies from one loop iteration to the next, we could use a stale value from a prior iteration to thread the current iteration. That's exactly what happens in the ffmpeg code.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also verified that the sample audio in the referenced BZs no longer chops off after ~2 seconds.

Installed on the trunk. OK for 4.9.1 after a suitable soak period on the trunk?



commit 02269351ce3a81b5470b8137fb3c34bca27011da
Author: Jeff Law <l...@redhat.com>
Date:   Wed Apr 23 00:25:47 2014 -0600

        PR tree-optimization/60902
        * tree-ssa-threadedge.c
        (record_temporary_equivalences_from_stmts_at_dest): Make sure to
        invalidate outputs from statements that do not produce useful
        outputs for threading.
    
        PR tree-optimization/60902
        * gcc.target/i386/pr60902.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 638c0da..ddebba7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2014-04-23  Jeff Law  <l...@redhat.com>
+
+       PR tree-optimization/60902
+       * tree-ssa-threadedge.c
+       (record_temporary_equivalences_from_stmts_at_dest): Make sure to
+       invalidate outputs from statements that do not produce useful
+       outputs for threading.
+
 2014-04-23 Venkataramanan Kumar  <venkataramanan.ku...@linaro.org>
 
        * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 126ad08..62b07f4 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-04-23  Jeff Law  <l...@redhat.com>
+
+       PR tree-optimization/60902
+       * gcc.target/i386/pr60902.c: New test.
+
 2014-04-23  Alex Velenko  <alex.vele...@arm.com>
 
        * gcc.target/aarch64/vdup_lane_1.c: New testcase.
diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c 
b/gcc/testsuite/gcc.target/i386/pr60902.c
new file mode 100644
index 0000000..b81dcd7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr60902.c
@@ -0,0 +1,32 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+extern void abort ();
+extern void exit (int);
+
+int x;
+
+foo()
+{
+  static int count;
+  count++;
+  if (count > 1)
+    abort ();
+}
+
+static inline int
+frob ()
+{
+  int a;
+  __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x));
+  x++;
+  return a;
+}
+
+int
+main ()
+{
+  int i;
+  for (i = 0; i < 10 && frob () == 0; i++)
+    foo();
+  exit (0);
+}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index c447b72..8a0103b 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge e,
           && (gimple_code (stmt) != GIMPLE_CALL
               || gimple_call_lhs (stmt) == NULL_TREE
               || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME))
-       continue;
+       {
+         /* STMT might still have DEFS and we need to invalidate any known
+            equivalences for them.
+
+            Consider if STMT is a GIMPLE_ASM with one or more outputs that
+            feeds a conditional inside a loop.  We might derive an equivalence
+            due to the conditional.  */
+         tree op;
+         ssa_op_iter iter;
+
+         if (backedge_seen)
+           FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS)
+             {
+               /* This call only invalidates equivalences created by
+                  PHI nodes.  This is by design to keep the cost of
+                  of invalidation reasonable.  */
+               invalidate_equivalences (op, stack, src_map, dst_map);
+
+               /* However, conditionals can imply values for real
+                  operands as well.  And those won't be recorded in the
+                  maps.  In fact, those equivalences may be recorded totally
+                  outside the threading code.  We can just create a new
+                  temporary NULL equivalence here.  */
+               record_temporary_equivalence (op, NULL_TREE, stack);
+             }
+
+         continue;
+       }
 
       /* The result of __builtin_object_size depends on all the arguments
         of a phi node. Temporarily using only one edge produces invalid

Reply via email to