Right now, jump threads which start inside a loop, cross the loop header, and then terminate within the loop are ignored, as threading them may create irreducible loops.

This patch gets the CFG/SSA updating code into good enough shape to handle the case that the embedded folks care about, i.e., the coremark FSA/FSM benchmark.

Basically we still ignore most of those threads -- the exception is when we find that the loop header ends with a multi-way branch. In that case we go ahead and thread the jump, then throw away and rebuild the loop information. Again, we only do this when we're able to remove a multi-way jump at the top of a loop.

During a GCC bootstrap I found two instances where this triggers: once in GCC itself and once in the Java runtime libraries. I've manually verified that we do the right thing in those two instances as well as for the FSA/FSM optimization. However, the reality is this code isn't being well exercised.

So (for testing purposes only) I removed the requirement that the loop header end with a multi-way jump and enabled a follow-on patch which allows us to identify many more threads through the loop header to points within the current loop. Not surprisingly, the code triggered more often and uncovered a couple of minor issues, which I've addressed.

The patch passed its bootstrap and regression test, as did a bootstrap and regression test with various pass limitations removed (with the exception of tests which explicitly verify that we don't thread in situations where doing so would create irreducible loops). This is good.

Installed on the trunk. One more patch to go before calling an end to my own development work for 4.9 :-)





        * tree-ssa-threadupdate.c: Include tree-cfg.h and tree-pass.h.
        (thread_block_1): Do not cancel jump threads which go from
        inside a loop, through the header, then back inside the loop.
        (bb_ends_with_multiway_branch): New function.
        (thread_through_all_blocks): Handle threading cases which start
        in a loop through the loop header to a point in the loop.



diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 777fe41..38d4f69 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -35,6 +35,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "hash-table.h"
 #include "dbgcnt.h"
+#include "tree-cfg.h"
+#include "tree-pass.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
    one or more in-edges to B to instead reach the destination of an
@@ -815,29 +817,15 @@ thread_block_1 (basic_block bb, bool noloop_only, bool joiners)
            }
 
          /* Another case occurs when trying to thread through our
-            own loop header, possibly from inside the loop.
-
-            If our loop header is buried in the path, then go ahead
-            and cancel the jump threading request here.  This likely
-            will need updating for the FSA/FSM coremark case.
-
-            Other cases (BB is the loop header) are handled elsewhere.  */
+            own loop header, possibly from inside the loop.  We will
+            thread these later.  */
          unsigned int i;
          for (i = 1; i < path->length (); i++)
            {
              if ((*path)[i]->e->src == bb->loop_father->header
                  && (!loop_exit_edge_p (bb->loop_father, e2)
                      || (*path)[1]->type == EDGE_COPY_SRC_JOINER_BLOCK))
-               {
-                 /* If i != 1, then it's a buried header that will not
-                    be handled elsehwere.  */
-                 if (i != 1)
-                   {
-                     delete_jump_thread_path (path);
-                     e->aux = NULL;
-                   }
-                 break;
-               }
+               break;
            }
 
          if (i != path->length ())
@@ -1554,6 +1542,20 @@ mark_threaded_blocks (bitmap threaded_blocks)
 }
 
 
+/* Return TRUE if BB ends with a switch statement or a computed goto.
+   Otherwise return false.  */
+static bool
+bb_ends_with_multiway_branch (basic_block bb ATTRIBUTE_UNUSED)
+{
+  gimple stmt = last_stmt (bb);
+  if (stmt && gimple_code (stmt) == GIMPLE_SWITCH)
+    return true;
+  if (stmt && gimple_code (stmt) == GIMPLE_GOTO
+      && TREE_CODE (gimple_goto_dest (stmt)) == SSA_NAME)
+    return true;
+  return false;
+}
+
 /* Walk through all blocks and thread incoming edges to the appropriate
    outgoing edge for each edge pair recorded in THREADED_EDGES.
 
@@ -1573,6 +1575,7 @@ thread_through_all_blocks (bool may_peel_loop_headers)
   bitmap_iterator bi;
   bitmap threaded_blocks;
   struct loop *loop;
+  bool totally_clobbered_loops = false;
 
   /* We must know about loops in order to preserve them.  */
   gcc_assert (current_loops != NULL);
@@ -1609,35 +1612,79 @@ thread_through_all_blocks (bool may_peel_loop_headers)
       retval |= thread_through_loop_header (loop, may_peel_loop_headers);
     }
 
-  /* Assume we had a jump thread path which went from the latch to the exit
-     and a path which goes from outside to inside the same loop.
-
-     If the latch to exit was handled first, we will thread it and clear
-     loop->header.
-
-     The second path will be ignored by thread_block because we're going
-     through a loop header.  It will also be ignored by the loop above
-     because loop->header is NULL.
-
-     This results in the second path never being threaded.  The failure
-     mode is a dangling AUX field.
-
-     This is inherently a bit of a pain to fix, so we just walk all the
-     blocks and all the incoming edges to those blocks and clear their
-     AUX fields.  */
+  /* Any jump threading paths that are still attached to edges at this
+     point must be one of two cases.
+
+     First, we could have a jump threading path which went from outside
+     a loop to inside a loop that was ignored because a prior jump thread
+     across a backedge was realized (which indirectly causes the loop
+     above to ignore the latter thread).  We can detect these because the
+     loop structures will be different and we do not currently try to
+     optimize this case.
+
+     Second, we could be threading across a backedge to a point within the
+     same loop.  This occurs for the FSA/FSM optimization and we would
+     like to optimize it.  However, we have to be very careful as this
+     may completely scramble the loop structures, with the result being
+     irreducible loops causing us to throw away our loop structure.
+
+     As a compromise for the latter case, if the thread path ends in
+     a block where the last statement is a multiway branch, then go
+     ahead and thread it, else ignore it.  */
   basic_block bb;
-  edge_iterator ei;
   edge e;
   FOR_EACH_BB (bb)
     {
-      FOR_EACH_EDGE (e, ei, bb->preds)
+      /* If we do end up threading here, we can remove elements from
+        BB->preds.  Thus we can not use the FOR_EACH_EDGE iterator.  */
+      for (edge_iterator ei = ei_start (bb->preds);
+          (e = ei_safe_edge (ei));)
        if (e->aux)
          {
            vec<jump_thread_edge *> *path = THREAD_PATH (e);
 
-           delete_jump_thread_path (path);
-           e->aux = NULL;
+           /* Case 1, threading from outside to inside the loop
+              after we'd already threaded through the header.  */
+           if ((*path)[0]->e->dest->loop_father
+               != path->last ()->e->src->loop_father)
+             {
+               delete_jump_thread_path (path);
+               e->aux = NULL;
+               ei_next (&ei);
+             }
+          else if (bb_ends_with_multiway_branch (path->last ()->e->src))
+             {
+               /* The code to thread through loop headers may have
+                  split a block with jump threads attached to it.
+
+                  We can identify this with a disjoint jump threading
+                  path.  If found, just remove it.  */
+               for (unsigned int i = 0; i < path->length () - 1; i++)
+                 if ((*path)[i]->e->dest != (*path)[i + 1]->e->src)
+                   {
+                     delete_jump_thread_path (path);
+                     e->aux = NULL;
+                     ei_next (&ei);
+                     break;
+                   }
+
+               /* Our path is still valid, thread it.  */
+               if (e->aux)
+                 {
+                   totally_clobbered_loops
+                     |= thread_block ((*path)[0]->e->dest, false);
+                   e->aux = NULL;
+                 }
+             }
+          else
+             {
+               delete_jump_thread_path (path);
+               e->aux = NULL;
+               ei_next (&ei);
+             }
          }
+       else
+         ei_next (&ei);
     }
 
   statistics_counter_event (cfun, "Jumps threaded",
@@ -1649,7 +1696,32 @@ thread_through_all_blocks (bool may_peel_loop_headers)
   threaded_blocks = NULL;
   paths.release ();
 
-  if (retval)
+  /* If we made changes to the CFG that might have totally messed
+     up the loop structure, then drop the old loop structure and
+     rebuild.  */
+  if (totally_clobbered_loops)
+    {
+      /* Release the current loop structures, they are totally
+        clobbered at this point.  */
+      loop_optimizer_finalize ();
+      current_loops = NULL;
+
+      /* Similarly for dominance information.  */
+      free_dominance_info (CDI_DOMINATORS);
+      free_dominance_info (CDI_POST_DOMINATORS);
+
+      /* Before we can rebuild the loop structures, we need dominators,
+        which requires no unreachable code.  So remove unreachable code.  */
+      delete_unreachable_blocks ();
+
+      /* Now rebuild the loop structures.  */
+      cfun->curr_properties &= ~PROP_loops;
+      loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
+      cfun->curr_properties |= PROP_loops;
+      retval = 1;
+    }
+
+  if (retval && current_loops)
     loops_state_set (LOOPS_NEED_FIXUP);
 
   return retval;
