I was looking at the routelookup EEMBC benchmark and it has code of the form:

   while ( this_node->cmpbit > next_node->cmpbit )
    {
      this_node = next_node;

      if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
         next_node = this_node->rlink;
      else
         next_node = this_node->llink;
    }

This is the classic binary tree/trie walk: each iteration follows either
the right link or the left link until a stopping condition is reached.  The
code in ifcvt.c does not optimize these cases into conditional moves, since
either load might trap, and it instead generates if-then-else code with loads
and jumps.

However, since the two elements are next to each other in memory, they are
likely in the same cache line, particularly with aligned stacks and malloc
returning aligned pointers.  Except in the unusual case where the pointer
is not aligned, it is much faster to optimize this as:

   while ( this_node->cmpbit > next_node->cmpbit )
    {
      this_node = next_node;

      rtmp = this_node->rlink;
      ltmp = this_node->llink;
      if ( proute->dst_addr & (0x1 << this_node->cmpbit) )
         next_node = rtmp;
      else
         next_node = ltmp;
    }

So I wrote some patches to do this optimization.  In ifcvt.c I added a new
hook that allows the backend to generate conditional moves for cases the
machine-independent code doesn't handle.

Then in rs6000.c I used that hook to check whether the two memory references
are adjacent and, if so, do the optimization.

I will note that this type of code comes up quite frequently, since binary
trees and tries are common data structures.  The file splay-tree.c in
libiberty is one place in the compiler tree that has adjacent conditional
memory moves.

So I would like comments on the patch before the 4.8 tree opens up.  I feel
that even if we decide not to add the adjacent memory move optimization, the
hook itself is useful, and I have some other ideas for using it on the
powerpc.

I was thinking about rewriting the rs6000-dependent parts to make this a
normal optimization available to all ports.  Is this something we want as a
normal option?  At the moment, I'm not sure it should be part of -O3, because
a trap can occur if the structure straddles a page boundary and the test
condition would have guarded against loading the second value.  However,
-Ofast might be an appropriate place to do this optimization.

At this time I don't have test cases, but I would add them for the real
submission.  I have bootstrapped the compiler on powerpc with this option
enabled; it passed the bootstrap with no regressions in make check.  I will
do a SPEC run over the weekend as well.

In addition to libiberty/splay-tree.c, the following files in gcc have
adjacent conditional memory moves that this code would optimize:

            cfg.c
            c-typeck.c
            df-scan.c
            fold-const.c
            graphds.c
            ira-emit.c
            omp-low.c
            rs6000.c
            tree-cfg.c
            tree-ssa-dom.c
            tree-ssa-loop-ivopts.c
            tree-ssa-phiopt.c
            tree-ssa-uncprop.c

2012-02-10  Michael Meissner  <meiss...@linux.vnet.ibm.com>

        * target.def (cmove_md_extra): New hook that is called from
        ifcvt.c to allow the backend to generate additional conditional
        moves that aren't handled by the machine independent code.  Add
        support to call the hook at the appropriate places.
        * targhooks.h (default_cmove_md_extra): Likewise.
        * targhooks.c (default_cmove_md_extra): Likewise.
        * target.h (enum ifcvt_pass): Likewise.
        * ifcvt.c (find_if_header): Likewise.
        (noce_find_if_block): Likewise.
        (struct noce_if_info): Likewise.
        (noce_process_if_block): Likewise.
        (cond_move_process_if_block): Likewise.
        (if_convert): Likewise.
        (rest_of_handle_if_conversion): Likewise.
        (rest_of_handle_if_after_combine): Likewise.
        (rest_of_handle_if_after_reload): Likewise.
        * doc/tm.texi (TARGET_CMOVE_MD_EXTRA): Likewise.
        * doc/tm.texi.in (TARGET_CMOVE_MD_EXTRA): Likewise.

        * config/rs6000/rs6000.opt (-mcmove-adjacent-memory): Add support
        for a new switch to optimize conditional moves where each side is
        a memory reference and the memory locations are adjacent.
        * config/rs6000/rs6000.c (rs6000_cmove_md_extra): Likewise.
        (TARGET_CMOVE_MD_EXTRA): Likewise.
        (rs6000_decompose_offsettable_memref): Likewise.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com     fax +1 (978) 399-6899
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi     (revision 184104)
+++ gcc/doc/tm.texi     (working copy)
@@ -10871,6 +10871,16 @@ conditional execution instructions inste
 1 if it does use cc0.
 @end defmac
 
+@deftypefn {Target Hook} rtx TARGET_CMOVE_MD_EXTRA (enum ifcvt_pass @var{if_pass}, rtx @var{dest}, rtx @var{compare}, rtx @var{op0}, rtx @var{op1})
+This function returns the insn for any machine specific conditional
+moves to set @var{dest} to @var{op0} if @var{compare} is non-zero or
+@var{op1} if it is zero.  The function returns a null pointer if there
+are no machine specific conditional moves.  The @var{if_pass} argument
+is @var{ifcvt_before_combine} on the first pass of if conversion,
+@var{ifcvt_after_combine} on the second pass of if conversion, and
+@var{ifcvt_after_reload} on the third pass of if conversion.
+@end deftypefn
+
 @defmac IFCVT_MODIFY_TESTS (@var{ce_info}, @var{true_expr}, @var{false_expr})
 Used if the target needs to perform machine-dependent modifications on the
 conditionals used for turning basic blocks into conditionally executed code.
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in  (revision 184104)
+++ gcc/doc/tm.texi.in  (working copy)
@@ -10749,6 +10749,16 @@ conditional execution instructions inste
 1 if it does use cc0.
 @end defmac
 
+@hook TARGET_CMOVE_MD_EXTRA
+This function returns the insn for any machine specific conditional
+moves to set @var{dest} to @var{op0} if @var{compare} is non-zero or
+@var{op1} if it is zero.  The function returns a null pointer if there
+are no machine specific conditional moves.  The @var{if_pass} argument
+is @var{ifcvt_before_combine} on the first pass of if conversion,
+@var{ifcvt_after_combine} on the second pass of if conversion, and
+@var{ifcvt_after_reload} on the third pass of if conversion.
+@end deftypefn
+
 @defmac IFCVT_MODIFY_TESTS (@var{ce_info}, @var{true_expr}, @var{false_expr})
 Used if the target needs to perform machine-dependent modifications on the
 conditionals used for turning basic blocks into conditionally executed code.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 184104)
+++ gcc/doc/invoke.texi (working copy)
@@ -839,7 +839,9 @@ See RS/6000 and PowerPC Options.
 -mno-recip-precision @gol
 -mveclibabi=@var{type} -mfriz -mno-friz @gol
 -mpointers-to-nested-functions -mno-pointers-to-nested-functions @gol
--msave-toc-indirect -mno-save-toc-indirect}
+-msave-toc-indirect -mno-save-toc-indirect @gol
+-mcmove-adjacent-memory -mno-cmove-adjacent-memory @gol
+-mcmove-adjacent-memory=@var{size}}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -17209,6 +17211,27 @@ stack location in the function prologue 
 a pointer on AIX and 64-bit Linux systems.  If the TOC value is not
 saved in the prologue, it is saved just before the call through the
 pointer.  The @option{-mno-save-toc-indirect} option is the default.
+
+@item -mcmove-adjacent-memory
+@itemx -mno-cmove-adjacent-memory
+@item -mcmove-adjacent-memory=@var{size}
+@opindex mcmove-adjacent-memory
+Generate (do not generate) code to optimize conditional moves where
+the true and false parts of the conditional move are in adjacent
+memory locations.  The default is
+@option{-mno-cmove-adjacent-memory}.  This option depends on the
+@option{-misel} option being set.
+
+This option is intended to speed up code that walks binary trees or
+tries with right and left pointers.  Typically, the code goes through
+the tree looking for an element, and at each point in the search it
+decides to go down the right or left link.  The right and left
+pointers are usually adjacent in the structure, and in the same cache
+line.
+
+The argument @var{size} overrides the default number of bytes that the
+compiler uses to decide whether two fields are adjacent.  By default, a
+value of 16 is used.
 @end table
 
 @node RX Options
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c     (revision 184104)
+++ gcc/targhooks.c     (working copy)
@@ -1,5 +1,5 @@
 /* Default target hook functions.
-   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -1449,4 +1449,17 @@ default_pch_valid_p (const void *data_p,
   return NULL;
 }
 
+/* Machine specific conditional moves that are not handled by the machine
+   independent code.  */
+
+rtx
+default_cmove_md_extra (enum ifcvt_pass if_pass ATTRIBUTE_UNUSED,
+                       rtx dest ATTRIBUTE_UNUSED,
+                       rtx compare ATTRIBUTE_UNUSED,
+                       rtx op0 ATTRIBUTE_UNUSED,
+                       rtx op1 ATTRIBUTE_UNUSED)
+{
+  return NULL_RTX;
+}
+
 #include "gt-targhooks.h"
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h     (revision 184104)
+++ gcc/targhooks.h     (working copy)
@@ -1,5 +1,5 @@
 /* Default target hook functions.
-   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011
+   Copyright (C) 2003, 2004, 2005, 2007, 2008, 2009, 2010, 2011, 2012
    Free Software Foundation, Inc.
 
 This file is part of GCC.
@@ -178,3 +178,4 @@ extern enum machine_mode default_get_reg
 
 extern void *default_get_pch_validity (size_t *);
 extern const char *default_pch_valid_p (const void *, size_t);
+extern rtx default_cmove_md_extra (enum ifcvt_pass, rtx, rtx, rtx, rtx);
Index: gcc/target.def
===================================================================
--- gcc/target.def      (revision 184104)
+++ gcc/target.def      (working copy)
@@ -1534,6 +1534,14 @@ DEFHOOK
  bool, (struct ao_ref_s *ref),
  default_ref_may_alias_errno)
 
+/* Machine specific conditional moves that are not handled by the machine
+   independent code.  */
+DEFHOOK
+(cmove_md_extra,
+ "",
+ rtx, (enum ifcvt_pass if_pass, rtx dest, rtx compare, rtx op0, rtx op1),
+ default_cmove_md_extra)
+
 /* Support for named address spaces.  */
 #undef HOOK_PREFIX
 #define HOOK_PREFIX "TARGET_ADDR_SPACE_"
Index: gcc/target.h
===================================================================
--- gcc/target.h        (revision 184104)
+++ gcc/target.h        (working copy)
@@ -149,6 +149,14 @@ enum vect_cost_for_stmt
   vec_promote_demote
 };
 
+/* Passes of if-conversion.  */
+enum ifcvt_pass
+{
+  ifcvt_before_combine,
+  ifcvt_after_combine,
+  ifcvt_after_reload
+};
+
 /* The target structure.  This holds all the backend hooks.  */
 #define DEFHOOKPOD(NAME, DOC, TYPE, INIT) TYPE NAME;
 #define DEFHOOK(NAME, DOC, TYPE, PARAMS, INIT) TYPE (* NAME) PARAMS;
Index: gcc/ifcvt.c
===================================================================
--- gcc/ifcvt.c (revision 184104)
+++ gcc/ifcvt.c (working copy)
@@ -1,6 +1,6 @@
 /* If-conversion support.
    Copyright (C) 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2010,
-   2011
+   2011, 2012
    Free Software Foundation, Inc.
 
    This file is part of GCC.
@@ -97,9 +97,9 @@ static rtx noce_get_condition (rtx, rtx 
 static int noce_operand_ok (const_rtx);
 static void merge_if_block (ce_if_block_t *);
 static int find_cond_trap (basic_block, edge, edge);
-static basic_block find_if_header (basic_block, int);
+static basic_block find_if_header (basic_block, int, enum ifcvt_pass);
 static int block_jumps_and_fallthru_p (basic_block, basic_block);
-static int noce_find_if_block (basic_block, edge, edge, int);
+static int noce_find_if_block (basic_block, edge, edge, int, enum ifcvt_pass);
 static int cond_exec_find_if_block (ce_if_block_t *);
 static int find_if_case_1 (basic_block, edge, edge);
 static int find_if_case_2 (basic_block, edge, edge);
@@ -762,6 +762,9 @@ struct noce_if_info
 
   /* Estimated cost of the particular branch instruction.  */
   int branch_cost;
+
+  /* Which ifcvt pass this is.  */
+  enum ifcvt_pass if_pass;
 };
 
 static rtx noce_emit_store_flag (struct noce_if_info *, rtx, int, int);
@@ -2438,7 +2441,7 @@ noce_process_if_block (struct noce_if_in
   basic_block join_bb = if_info->join_bb;      /* JOIN */
   rtx jump = if_info->jump;
   rtx cond = if_info->cond;
-  rtx insn_a, insn_b;
+  rtx insn_a, insn_b, insn;
   rtx set_a, set_b;
   rtx orig_x, x, a, b;
 
@@ -2533,7 +2536,19 @@ noce_process_if_block (struct noce_if_in
 
   /* Don't operate on sources that may trap or are volatile.  */
   if (! noce_operand_ok (a) || ! noce_operand_ok (b))
-    return FALSE;
+    {
+      /* Give the back end a chance to handle conditional moves that aren't
+        supported here.  The machine still might have some way of handling
+        sources that trap or are volatile.  */
+      insn = targetm.cmove_md_extra (if_info->if_pass, x, cond, a, b);
+      if (insn)
+       {
+         emit_insn_before_setloc (insn, jump, INSN_LOCATOR (insn_a));
+         goto success;
+       }
+
+      return FALSE;
+    }
 
  retry:
   /* Set up the info block for our subroutines.  */
@@ -2633,6 +2648,16 @@ noce_process_if_block (struct noce_if_in
        goto success;
     }
 
+  /* Give the back end a chance to handle conditional moves that aren't
+     supported here.  */
+  insn = targetm.cmove_md_extra (if_info->if_pass, x, cond, a, b);
+  if (insn)
+    {
+      emit_insn_before_setloc (insn, if_info->jump,
+                              INSN_LOCATOR (if_info->insn_a));
+      goto success;
+    }
+
   if (!else_bb && set_b)
     {
       insn_b = set_b = NULL_RTX;
@@ -2976,7 +3001,7 @@ cond_move_process_if_block (struct noce_
 
 static int
 noce_find_if_block (basic_block test_bb, edge then_edge, edge else_edge,
-                   int pass)
+                   int pass, enum ifcvt_pass if_pass)
 {
   basic_block then_bb, else_bb, join_bb;
   bool then_else_reversed = false;
@@ -3076,6 +3101,7 @@ noce_find_if_block (basic_block test_bb,
   if_info.then_else_reversed = then_else_reversed;
   if_info.branch_cost = BRANCH_COST (optimize_bb_for_speed_p (test_bb),
                                     predictable_edge_p (then_edge));
+  if_info.if_pass = if_pass;
 
   /* Do the real work.  */
 
@@ -3207,7 +3233,7 @@ merge_if_block (struct ce_if_block * ce_
    first block if some transformation was done.  Return NULL otherwise.  */
 
 static basic_block
-find_if_header (basic_block test_bb, int pass)
+find_if_header (basic_block test_bb, int pass, enum ifcvt_pass if_pass)
 {
   ce_if_block_t ce_info;
   edge then_edge;
@@ -3259,7 +3285,7 @@ find_if_header (basic_block test_bb, int
 #endif
 
   if (!reload_completed
-      && noce_find_if_block (test_bb, then_edge, else_edge, pass))
+      && noce_find_if_block (test_bb, then_edge, else_edge, pass, if_pass))
     goto success;
 
   if (reload_completed
@@ -4334,7 +4360,7 @@ dead_or_predicable (basic_block test_bb,
 /* Main entry point for all if-conversion.  */
 
 static void
-if_convert (void)
+if_convert (enum ifcvt_pass if_pass)
 {
   basic_block bb;
   int pass;
@@ -4380,7 +4406,7 @@ if_convert (void)
        {
           basic_block new_bb;
           while (!df_get_bb_dirty (bb)
-                 && (new_bb = find_if_header (bb, pass)) != NULL)
+                 && (new_bb = find_if_header (bb, pass, if_pass)) != NULL)
             bb = new_bb;
        }
 
@@ -4450,7 +4476,7 @@ rest_of_handle_if_conversion (void)
       if (dump_file)
         dump_flow_info (dump_file, dump_flags);
       cleanup_cfg (CLEANUP_EXPENSIVE);
-      if_convert ();
+      if_convert (ifcvt_before_combine);
     }
 
   cleanup_cfg (0);
@@ -4490,7 +4516,7 @@ gate_handle_if_after_combine (void)
 static unsigned int
 rest_of_handle_if_after_combine (void)
 {
-  if_convert ();
+  if_convert (ifcvt_after_combine);
   return 0;
 }
 
@@ -4525,7 +4551,7 @@ gate_handle_if_after_reload (void)
 static unsigned int
 rest_of_handle_if_after_reload (void)
 {
-  if_convert ();
+  if_convert (ifcvt_after_reload);
   return 0;
 }
 
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt        (revision 184104)
+++ gcc/config/rs6000/rs6000.opt        (working copy)
@@ -1,7 +1,7 @@
 ; Options for the rs6000 port of the compiler
 ;
-; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software
-; Foundation, Inc.
+; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012
+; Free Software Foundation, Inc.
 ; Contributed by Aldy Hernandez <a...@quesejoda.com>.
 ;
 ; This file is part of GCC.
@@ -532,3 +532,11 @@ Use/do not use r11 to hold the static li
 msave-toc-indirect
 Target Report Var(TARGET_SAVE_TOC_INDIRECT) Save
Control whether we save the TOC in the prologue for indirect calls or generate the save inline
+
+mcmove-adjacent-memory
+Target Report Var(TARGET_CMOVE_ADJACENT_MEMORY,16) Init(0) Save
+Control whether we load both sides of a conditional move if they are adjacent in memory.
+
+mcmove-adjacent-memory=
+Target Report RejectNegative Joined UInteger Var(TARGET_CMOVE_ADJACENT_MEMORY)
+Set the line size used to decide whether two memory items are adjacent.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c  (revision 184104)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -1229,6 +1229,7 @@ static bool rs6000_legitimate_constant_p
 static bool rs6000_save_toc_in_prologue_p (void);
 static void rs6000_code_end (void) ATTRIBUTE_UNUSED;
 static void rs6000_set_up_by_prologue (struct hard_reg_set_container *);
+static rtx rs6000_cmove_md_extra (enum ifcvt_pass, rtx, rtx, rtx, rtx);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -1672,6 +1673,9 @@ static const struct attribute_spec rs600
 
 #undef TARGET_VECTORIZE_VEC_PERM_CONST_OK
 #define TARGET_VECTORIZE_VEC_PERM_CONST_OK rs6000_vectorize_vec_perm_const_ok
+
+#undef TARGET_CMOVE_MD_EXTRA
+#define TARGET_CMOVE_MD_EXTRA rs6000_cmove_md_extra
 
 
 /* Simplifications for entries below.  */
@@ -16348,6 +16352,138 @@ rs6000_emit_vector_cond_expr (rtx dest, 
   return 1;
 }
 
+/* If OP is an offsettable memory reference, return true and store its base
+   register in *BASE and its offset in *OFFSET.  */
+
+static inline bool
+rs6000_decompose_offsettable_memref (rtx op, rtx *base, HOST_WIDE_INT *offset)
+{
+  rtx addr, right, left;
+
+  *base = NULL_RTX;
+  *offset = -1;
+
+  if (!MEM_P (op) || side_effects_p (op) || volatile_refs_p (op))
+    return false;
+
+  addr = XEXP (op, 0);
+  if (GET_CODE (addr) == REG)
+    {
+      *base = addr;
+      *offset = 0;
+      return true;
+    }
+  else if (GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM)
+    {
+      right = XEXP (addr, 0);
+      left = XEXP (addr, 1);
+      if (!REG_P (right) || GET_CODE (left) != CONST_INT)
+       return false;
+
+      *base = right;
+      *offset = INTVAL (left);
+      return true;
+    }
+
+  return false;
+}
+
+/* Do any machine specific optimization of conditional moves.  On the rs6000,
+   if both sides of the conditional move are memory operations that seem to be
+   within the same cache block, then the second load is 'free', and we use
+   load/load/isel if the user enables the option.  This is useful for balanced
+   binary trees/tries where you chase through a list following either the right
+   or left pointers.  If the memory addresses aren't appropriately aligned,
+   then this could potentially cause a page fault or a trap.  */
+
+static rtx
+rs6000_cmove_md_extra (enum ifcvt_pass if_pass, rtx dest, rtx compare, rtx op0,
+                      rtx op1)
+{
+  enum machine_mode mode0 = GET_MODE (op0);
+  enum machine_mode mode1 = GET_MODE (op1);
+  HOST_WIDE_INT size = GET_MODE_SIZE (mode0);
+  HOST_WIDE_INT op0_size = size;
+  HOST_WIDE_INT op1_size = size;
+  HOST_WIDE_INT line_size = TARGET_CMOVE_ADJACENT_MEMORY;
+  HOST_WIDE_INT offset0 = -1;
+  HOST_WIDE_INT offset1 = -1;
+  rtx op0_no_extend = op0;
+  rtx op1_no_extend = op1;
+  rtx ret = NULL_RTX;
+  rtx base0, base1, cmp_op0, cmp_op1;
+  enum rtx_code cmp_code = GET_CODE (compare);
+
+  if (if_pass == ifcvt_after_reload)
+    return NULL_RTX;
+
+  /* Allow sign/zero extend, but only do before combine, since the force_reg
+     will split the load and the sign/zero extend.  Combine will join these
+     back together.  */
+  if (if_pass == ifcvt_before_combine)
+    {
+      if (GET_CODE (op0) == SIGN_EXTEND || GET_CODE (op0) == ZERO_EXTEND)
+       {
+         op0_no_extend = XEXP (op0, 0);
+         op0_size = GET_MODE_SIZE (GET_MODE (op0_no_extend));
+       }
+
+      if (GET_CODE (op1) == SIGN_EXTEND || GET_CODE (op1) == ZERO_EXTEND)
+       {
+         op1_no_extend = XEXP (op1, 0);
+         op1_size = GET_MODE_SIZE (GET_MODE (op1_no_extend));
+       }
+    }
+
+  if (!TARGET_ISEL || line_size < 2*size || mode0 != mode1
+      || !MEM_P (op0_no_extend) || !MEM_P (op1_no_extend)
+      || !INTEGRAL_MODE_P (mode0))
+    return NULL_RTX;
+
+  /* Only do this optimization for simple comparisons.  */
+  if (GET_RTX_CLASS (cmp_code) != RTX_COMPARE
+      && GET_RTX_CLASS (cmp_code) != RTX_COMM_COMPARE)
+    return NULL_RTX;
+
+  cmp_op0 = XEXP (compare, 0);
+  if (!REG_P (cmp_op0))
+    return NULL_RTX;
+
+  cmp_op1 = XEXP (compare, 1);
+  if (!REG_P (cmp_op1) && cmp_op1 != const0_rtx)
+    return NULL_RTX;
+
+  /* Are memory references reg+offset?  */
+  if (!rs6000_decompose_offsettable_memref (op0_no_extend, &base0, &offset0))
+    return NULL_RTX;
+
+  if (!rs6000_decompose_offsettable_memref (op1_no_extend, &base1, &offset1))
+    return NULL_RTX;
+
+  /* Are the base pointers the same?  */
+  if (!rtx_equal_p (base0, base1))
+    return NULL_RTX;
+
+  /* Are the two fields aligned on a word/double word boundary?  */
+  if ((offset0 % op0_size) != 0 || (offset1 % op1_size) != 0)
+    return NULL_RTX;
+
+  /* Are the memory locations in different cache lines?  */
+  if ((offset0 / line_size) != (offset1 / line_size))
+    return NULL_RTX;
+
+  start_sequence ();
+  /* We reverse op0/op1 here because we are replacing an if statement, where it
+     jumps to do the ELSE part if the condition is true.  */
+  if (rs6000_emit_int_cmove (dest,
+                            copy_rtx (compare),
+                            force_reg (mode1, copy_rtx (op1)),
+                            force_reg (mode0, copy_rtx (op0))))
+    ret = get_insns ();
+  end_sequence ();
+  return ret;
+}
+
 /* Emit a conditional move: move TRUE_COND to DEST if OP of the
    operands of the last comparison is nonzero/true, FALSE_COND if it
    is zero/false.  Return 0 if the hardware has no such operation.  */
