The Power processor has the ability to fuse certain pairs of dependent instructions to improve their performance if they appear back-to-back in the instruction stream. In looking at the current support for instruction fusion in GCC I saw the following 2 options.
1) TARGET_SCHED_MACRO_FUSION target hooks: Only looks at existing back-to-back instructions and will ensure the scheduler keeps them together. 2) -fsched-fusion/TARGET_SCHED_FUSION_PRIORITY: Runs as a separate scheduling pass before peephole2. Operates independently on a single insn. Used by ARM backend to assign higher priorities to base/disp loads and stores so that the scheduling pass will schedule loads/stores to adjacent memory back-to-back. Later these insns will be transformed into load/store pair insns. Neither of these work for Power's purpose because they don't deal with fusion of dependent insns that may not already be back-to-back. The TARGET_SCHED_REORDER[2] hooks also don't work since the dependent insn more than likely gets queued for N cycles so wouldn't be on the ready list for the reorder hooks to process. We want the ability for the scheduler to schedule dependent insn pairs back-to-back when possible (i.e. other dependencies of both insns have been satisfied). I have coded up a proof of concept that implements our needs via a new target hook. The hook is passed a pair of dependent insns and returns if they are a fusion candidate. It is called while removing the forward dependencies of the just scheduled insn. If a dependent insn becomes available to schedule and it's a fusion candidate with the just scheduled insn, then the new code moves it to the ready list (if necessary) and marks it as SCHED_GROUP (piggy-backing on the existing code used by TARGET_SCHED_MACRO_FUSION) to make sure the fusion candidate will be scheduled next. Following is the scheduling part of the diff. Does this sound like a feasible approach? I welcome any comments/discussion. Thanks, Pat diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 80687fb5359..7a62136d497 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -4152,6 +4152,39 @@ schedule_insn (rtx_insn *insn) && SCHED_GROUP_P (next) && advance < effective_cost) advance = effective_cost; + + /* If all dependencies of this insn have now been resolved and the + just scheduled insn was not part of a SCHED_GROUP, check if + this dependent insn can be fused with the just scheduled insn. */ + else if (effective_cost >= 0 && !SCHED_GROUP_P (insn) + && targetm.sched.dep_fusion + && targetm.sched.dep_fusion (insn, next)) + { + /* Move to ready list if necessary. */ + if (effective_cost > 0) + { + queue_remove (next); + ready_add (&ready, next, true); + } + + /* Mark as sched_group. */ + SCHED_GROUP_P (next) = 1; + + /* Fix insn_tick. */ + INSN_TICK (next) = INSN_TICK (insn); + + /* Dump some debug output for success. */ + if (sched_verbose >= 5) + { + fprintf (sched_dump, ";;\t\tFusing dependent insns: "); + fprintf (sched_dump, "%4d %-30s --> ", INSN_UID (insn), + str_pattern_slim (PATTERN (insn))); + fprintf (sched_dump, "%4d %-30s\n", INSN_UID (next), + str_pattern_slim (PATTERN (next))); + } + } } else /* Check always has only one forward dependence (to the first insn in