The Power processor has the ability to fuse certain pairs of dependent
instructions to improve their performance if they appear back-to-back in
the instruction stream. Looking at the current support for instruction
fusion in GCC, I found the following two options.

1) TARGET_SCHED_MACRO_FUSION target hooks: Only looks at existing
back-to-back instructions and will ensure the scheduler keeps them together.

2) -fsched-fusion/TARGET_SCHED_FUSION_PRIORITY: Runs as a separate
scheduling pass before peephole2. Operates independently on a single
insn. Used by ARM backend to assign higher priorities to base/disp loads
and stores so that the scheduling pass will schedule loads/stores to
adjacent memory back-to-back. Later these insns will be transformed into
load/store pair insns.

Neither of these works for Power's purpose because they don't deal with
fusion of dependent insns that may not already be back-to-back. The
TARGET_SCHED_REORDER[2] hooks also don't work, since the dependent insn
more than likely gets queued for N cycles and so wouldn't be on the ready
list for the reorder hooks to process. We want the scheduler to be able
to schedule dependent insn pairs back-to-back when possible (i.e. when
the other dependencies of both insns have been satisfied).

I have coded up a proof of concept that implements our needs via a new
target hook. The hook is passed a pair of dependent insns and returns
whether they are a fusion candidate. It is called while removing the forward
dependencies of the just scheduled insn. If a dependent insn becomes
available to schedule and it's a fusion candidate with the just
scheduled insn, then the new code moves it to the ready list (if
necessary) and marks it as SCHED_GROUP (piggy-backing on the existing
code used by TARGET_SCHED_MACRO_FUSION) to make sure the fusion
candidate will be scheduled next. Following is the scheduling part of
the diff. Does this sound like a feasible approach? I welcome any
comments/discussion.

Thanks,
Pat


diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 80687fb5359..7a62136d497 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -4152,6 +4152,39 @@ schedule_insn (rtx_insn *insn)
              && SCHED_GROUP_P (next)
              && advance < effective_cost)
            advance = effective_cost;
+
+         /* If all dependencies of this insn have now been resolved and the
+            just scheduled insn was not part of a SCHED_GROUP, check if
+            this dependent insn can be fused with the just scheduled insn.  */
+         else if (effective_cost >= 0 && !SCHED_GROUP_P (insn)
+                  && targetm.sched.dep_fusion
+                  && targetm.sched.dep_fusion (insn, next))
+           {
+             /* Move to ready list if necessary.  */
+             if (effective_cost > 0)
+               {
+                 queue_remove (next);
+                 ready_add (&ready, next, true);
+               }
+
+             /* Mark as sched_group.  */
+             SCHED_GROUP_P (next) = 1;
+
+             /* Fix insn_tick.  */
+             INSN_TICK (next) = INSN_TICK (insn);
+
+             /* Dump some debug output for success.  */
+             if (sched_verbose >= 5)
+               {
+                 fprintf (sched_dump, ";;\t\tFusing dependent insns: ");
+                 fprintf (sched_dump, "%4d %-30s --> ", INSN_UID (insn),
+                          str_pattern_slim (PATTERN (insn)));
+                 fprintf (sched_dump, "%4d %-30s\n", INSN_UID (next),
+                          str_pattern_slim (PATTERN (next)));
+               }
+           }
        }
       else
        /* Check always has only one forward dependence (to the first insn in
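
To make the decision logic easier to discuss outside of haifa-sched.c,
here is a minimal standalone sketch of the same check. It is a toy model,
not GCC code: the toy_* types and functions are hypothetical names made
up for illustration, and the addis+ld pairing is only an example of what
a target predicate might accept. It leaves out the queue/ready list
handling and the INSN_TICK fixup that the real patch does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for scheduler state.  None of these are GCC types.  */
enum toy_op { TOY_ADDIS, TOY_LD, TOY_ADD };

struct toy_insn
{
  enum toy_op op;
  int unresolved_deps;  /* Producers of this insn not yet scheduled.  */
  bool sched_group;     /* Must issue right after the previous insn.  */
};

/* Hypothetical target predicate playing the role of the proposed hook:
   return true if CONSUMER is a fusion candidate with PRODUCER.  The
   addis+ld pair is an assumption chosen purely for illustration.  */
static bool
toy_dep_fusion (const struct toy_insn *producer,
                const struct toy_insn *consumer)
{
  return producer->op == TOY_ADDIS && consumer->op == TOY_LD;
}

/* Called for one dependent CONSUMER after PRODUCER has been scheduled.
   Resolves the dependence; if CONSUMER has no other unresolved
   dependencies, PRODUCER is not already part of a group, and the pair
   is a fusion candidate, pin CONSUMER to be scheduled next.
   Returns true if CONSUMER was pinned.  */
static bool
toy_resolve_dep (const struct toy_insn *producer,
                 struct toy_insn *consumer)
{
  assert (consumer->unresolved_deps > 0);
  consumer->unresolved_deps--;

  if (consumer->unresolved_deps == 0
      && !producer->sched_group
      && toy_dep_fusion (producer, consumer))
    {
      consumer->sched_group = true;
      return true;
    }
  return false;
}
```

In this model a load whose only outstanding producer is the just
scheduled addis gets pinned, while a load that still waits on a second
producer is left alone, which is the "when possible" condition described
above.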
