On Wed, Apr 23, 2014 at 8:01 PM, Jeff Law <l...@redhat.com> wrote: > > The more aggressive threading across loop backedges requires invalidating > equivalences that do not hold across all iterations of a loop. > > At first glance, invaliding at PHI nodes should be sufficient as any > statement which potentially generated a new equivalence would be reprocessed > as we come across the backedge. However, there is one important case where > that does not hold. > > Specifically we might have derived a value from a conditional and the > conditional might have been fed by a statement that doesn't produce useful > equivalences (such as a GIMPLE_ASM). Thus the equivalence from the > conditional is still visible because no new equivalence will be recorded for > the GIMPLE_ASM. > > So if the result of the GIMPLE_ASM that gets used in the conditional varies > from one loop iteration to the next, we could use a stale value from a prior > iteration to thread the current iteration. That's exactly what happens in > the ffmpeg code. > > Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Also > verified that the sample audio in the referenced BZs no longer chops off > after ~2 seconds. > > Installed on the trunk. OK for 4.9.1 after a suitable soak period on the > trunk?
Ok, but ... > > > > commit 02269351ce3a81b5470b8137fb3c34bca27011da > Author: Jeff Law <l...@redhat.com> > Date: Wed Apr 23 00:25:47 2014 -0600 > > PR tree-optimization/60902 > * tree-ssa-threadedge.c > (record_temporary_equivalences_from_stmts_at_dest): Make sure to > invalidate outputs from statements that do not produce useful > outputs for threading. > > PR tree-optimization/60902 > * gcc.target/i386/pr60902.c: New test. > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 638c0da..ddebba7 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,11 @@ > +2014-04-23 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/60902 > + * tree-ssa-threadedge.c > + (record_temporary_equivalences_from_stmts_at_dest): Make sure to > + invalidate outputs from statements that do not produce useful > + outputs for threading. > + > 2014-04-23 Venkataramanan Kumar <venkataramanan.ku...@linaro.org> > > * config/aarch64/aarch64.md (stack_protect_set, stack_protect_test) > diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog > index 126ad08..62b07f4 100644 > --- a/gcc/testsuite/ChangeLog > +++ b/gcc/testsuite/ChangeLog > @@ -1,3 +1,8 @@ > +2014-04-23 Jeff Law <l...@redhat.com> > + > + PR tree-optimization/60902 > + * gcc.target/i386/pr60902.c: New test. > + > 2014-04-23 Alex Velenko <alex.vele...@arm.com> > > * gcc.target/aarch64/vdup_lane_1.c: New testcase. > diff --git a/gcc/testsuite/gcc.target/i386/pr60902.c > b/gcc/testsuite/gcc.target/i386/pr60902.c > new file mode 100644 > index 0000000..b81dcd7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr60902.c > @@ -0,0 +1,32 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2" } */ > +extern void abort (); > +extern void exit (int); > + > +int x; > + > +foo() > +{ > + static int count; > + count++; > + if (count > 1) > + abort (); > +} > + > +static inline int > +frob () > +{ > + int a; > + __asm__ ("mov %1, %0\n\t" : "=r" (a) : "m" (x)); > + x++; > + return a; > +} > + > +int > +main () > +{ > + int i; > + for (i = 0; i < 10 && frob () == 0; i++) > + foo(); > + exit (0); > +} > diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c > index c447b72..8a0103b 100644 > --- a/gcc/tree-ssa-threadedge.c > +++ b/gcc/tree-ssa-threadedge.c > @@ -387,7 +387,34 @@ record_temporary_equivalences_from_stmts_at_dest (edge > e, > && (gimple_code (stmt) != GIMPLE_CALL > || gimple_call_lhs (stmt) == NULL_TREE > || TREE_CODE (gimple_call_lhs (stmt)) != SSA_NAME)) > - continue; > + { > + /* STMT might still have DEFS and we need to invalidate any known > + equivalences for them. > + > + Consider if STMT is a GIMPLE_ASM with one or more outputs that > + feeds a conditional inside a loop. We might derive an > equivalence > + due to the conditional. */ > + tree op; > + ssa_op_iter iter; > + > + if (backedge_seen) > + FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_ALL_DEFS) You only need SSA_OP_DEF here, no need to process virtual operands. > + { > + /* This call only invalidates equivalences created by > + PHI nodes. This is by design to keep the cost of > + of invalidation reasonable. */ > + invalidate_equivalences (op, stack, src_map, dst_map); > + > + /* However, conditionals can imply values for real > + operands as well. And those won't be recorded in the > + maps. In fact, those equivalences may be recorded > totally > + outside the threading code. We can just create a new > + temporary NULL equivalence here. */ > + record_temporary_equivalence (op, NULL_TREE, stack); > + } > + > + continue; > + } > > /* The result of __builtin_object_size depends on all the arguments > of a phi node. Temporarily using only one edge produces invalid >