As noted in the last patch, rs6000_rtx_costs ought to cost slow
unaligned mems.  This stops combine from merging loads/stores with a
mode-changing SET subreg when the load/store in the subreg mode would
be slow.  Costing slow mems at 100 insns is just an order-of-magnitude
estimate.  (The alignment interrupt does cost quite a lot:
experiments on power8 with a misaligned lwarx showed that taking the
alignment interrupt costs roughly 300 insns.)
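
To make the arithmetic concrete, here is a minimal standalone sketch
of how the patched MEM cost works out.  mem_cost is a made-up helper,
not GCC code; the COSTS_N_INSNS definition is copied from rtl.h, which
defines it as (N) * 4:

  #include <stdio.h>

  #define COSTS_N_INSNS(N) ((N) * 4)	/* as in GCC's rtl.h */

  /* Sketch only: mirrors the patched MEM case of rs6000_rtx_costs,
     with the rtx tests boiled down to plain ints.  */
  static int
  mem_cost (int speed, int slow_unaligned)
  {
    /* Base cost: slightly more than an address when optimizing for
       size, about L1 latency (two insns) when optimizing for speed.  */
    int total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
    if (slow_unaligned)
      total += COSTS_N_INSNS (100);	/* order-of-magnitude penalty */
    return total;
  }

  int
  main (void)
  {
    printf ("aligned, speed:   %d\n", mem_cost (1, 0));	/* 8 */
    printf ("unaligned, speed: %d\n", mem_cost (1, 1));	/* 408 */
    return 0;
  }

So an ordinary mem costs 8 when optimizing for speed, while a slow
unaligned one costs 408, large enough that combine's cost comparison
rejects a merge that would introduce one.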

Bootstrapped and regression tested on powerpc64le-linux and
powerpc64-linux.

        * config/rs6000/rs6000.c (rs6000_rtx_costs): Make unaligned mem
        cost more.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 5b9aae2..2ae3e7e 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -34336,11 +34336,16 @@ rs6000_rtx_costs (rtx x, machine_mode mode, int outer_code,
     case CONST:
     case HIGH:
     case SYMBOL_REF:
+      *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
+      return true;
+
     case MEM:
       /* When optimizing for size, MEM should be slightly more expensive
         than generating address, e.g., (plus (reg) (const)).
         L1 cache latency is about two instructions.  */
       *total = !speed ? COSTS_N_INSNS (1) + 1 : COSTS_N_INSNS (2);
+      if (SLOW_UNALIGNED_ACCESS (mode, MEM_ALIGN (x)))
+       *total += COSTS_N_INSNS (100);
       return true;
 
     case LABEL_REF:

-- 
Alan Modra
Australia Development Lab, IBM