Re: [PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

Richard Sandiford Wed, 13 Dec 2023 15:34:03 -0800

Alex Coplan <alex.cop...@arm.com> writes:
> Hi,
>
> In PR112906 we ICE because we try to use force_reg to reload an
> auto-increment address, but force_reg can't do this.
>
> With the aim of fixing the PR by supporting reloading arbitrary
> addresses in pre-RA splitters, this patch generalizes
> lra-constraints.cc:emit_inc and makes it available to the rest of the
> compiler by moving the generalized version to emit-rtl.cc.
>
> We observe that the separate IN parameter to LRA's emit_inc is
> redundant, since the function is static and is only (statically) called
> once in lra-constraints.cc, with in == value.  As such, we drop the IN
> parameter and simplify the code accordingly.
>
> We wrap the emit_inc code in a virtual class to allow LRA to override
> how reload pseudos are created, thereby preserving the existing LRA
> behaviour as much as possible.
>
> We then add a second (higher-level) routine to emit-rtl.cc,
> force_reload_address, which can reload arbitrary addresses.  This uses
> the generalized emit_inc code to handle the RTX_AUTOINC case.  The
> second patch in this series uses force_reload_address to fix PR112906.
>
> Since we intend to call address_reload_context::emit_autoinc from within
> splitters, and the code lifted from LRA calls recog, we have to avoid
> clobbering recog_data.  We do this by introducing a new RAII class for
> saving/restoring recog_data on the stack.
>
> Bootstrapped/regtested on aarch64-linux-gnu, bootstrapped on
> x86_64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>       PR target/112906
>       * emit-rtl.cc (address_reload_context::emit_autoinc): New.
>       (force_reload_address): New.
>       * emit-rtl.h (struct address_reload_context): Declare.
>       (force_reload_address): Declare.
>       * lra-constraints.cc (class lra_autoinc_reload_context): New.
>       (emit_inc): Drop IN parameter, invoke
>       code moved to emit-rtl.cc:address_reload_context::emit_autoinc.
>       (curr_insn_transform): Drop redundant IN parameter in call to
>       emit_inc.
>       * recog.h (class recog_data_saver): New.


LGTM, and looks like it should functionally be a no-op for LRA.

So OK, but please give Vlad 24 hours to comment, or to ask for more time.

Thanks,
Richard

>
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index 4a7e420e7c0..ce7b98bf006 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "rtl-iter.h"
>  #include "stor-layout.h"
>  #include "opts.h"
> +#include "optabs.h"
>  #include "predict.h"
>  #include "rtx-vector-builder.h"
>  #include "gimple.h"
> @@ -2576,6 +2577,140 @@ replace_equiv_address_nv (rtx memref, rtx addr, bool 
> inplace)
>    return change_address_1 (memref, VOIDmode, addr, 0, inplace);
>  }
>  
> +
> +/* Emit insns to reload VALUE into a new register.  VALUE is an
> +   auto-increment or auto-decrement RTX whose operand is a register or
> +   memory location; so reloading involves incrementing that location.
> +
> +   INC_AMOUNT is the number to increment or decrement by (always
> +   positive and ignored for POST_MODIFY/PRE_MODIFY).
> +
> +   Return a pseudo containing the result.  */
> +rtx
> +address_reload_context::emit_autoinc (rtx value, poly_int64 inc_amount)
> +{
> +  /* Since we're going to call recog, and might be called within recog,
> +     we need to ensure we save and restore recog_data.  */
> +  recog_data_saver recog_save;
> +
> +  /* REG or MEM to be copied and incremented.  */
> +  rtx incloc = XEXP (value, 0);
> +
> +  const rtx_code code = GET_CODE (value);
> +  const bool post_p
> +    = code == POST_DEC || code == POST_INC || code == POST_MODIFY;
> +
> +  bool plus_p = true;
> +  rtx inc;
> +  if (code == PRE_MODIFY || code == POST_MODIFY)
> +    {
> +      gcc_assert (GET_CODE (XEXP (value, 1)) == PLUS
> +               || GET_CODE (XEXP (value, 1)) == MINUS);
> +      gcc_assert (rtx_equal_p (XEXP (XEXP (value, 1), 0), XEXP (value, 0)));
> +      plus_p = GET_CODE (XEXP (value, 1)) == PLUS;
> +      inc = XEXP (XEXP (value, 1), 1);
> +    }
> +  else
> +    {
> +      if (code == PRE_DEC || code == POST_DEC)
> +     inc_amount = -inc_amount;
> +
> +      inc = gen_int_mode (inc_amount, GET_MODE (value));
> +    }
> +
> +  rtx result;
> +  if (!post_p && REG_P (incloc))
> +    result = incloc;
> +  else
> +    {
> +      result = get_reload_reg ();
> +      /* First copy the location to the result register.  */
> +      emit_insn (gen_move_insn (result, incloc));
> +    }
> +
> +  /* See if we can directly increment INCLOC.  */
> +  rtx_insn *last = get_last_insn ();
> +  rtx_insn *add_insn = emit_insn (plus_p
> +                               ? gen_add2_insn (incloc, inc)
> +                               : gen_sub2_insn (incloc, inc));
> +  const int icode = recog_memoized (add_insn);
> +  if (icode >= 0)
> +    {
> +      if (!post_p && result != incloc)
> +     emit_insn (gen_move_insn (result, incloc));
> +      return result;
> +    }
> +  delete_insns_since (last);
> +
> +  /* If couldn't do the increment directly, must increment in RESULT.
> +     The way we do this depends on whether this is pre- or
> +     post-increment.  For pre-increment, copy INCLOC to the reload
> +     register, increment it there, then save back.  */
> +  if (!post_p)
> +    {
> +      if (incloc != result)
> +     emit_insn (gen_move_insn (result, incloc));
> +      if (plus_p)
> +     emit_insn (gen_add2_insn (result, inc));
> +      else
> +     emit_insn (gen_sub2_insn (result, inc));
> +      if (incloc != result)
> +     emit_insn (gen_move_insn (incloc, result));
> +    }
> +  else
> +    {
> +      /* Post-increment.
> +
> +      Because this might be a jump insn or a compare, and because
> +      RESULT may not be available after the insn in an input
> +      reload, we must do the incrementing before the insn being
> +      reloaded for.
> +
> +      We have already copied INCLOC to RESULT.  Increment the copy in
> +      RESULT, save that back, then decrement RESULT so it has
> +      the original value.  */
> +      if (plus_p)
> +     emit_insn (gen_add2_insn (result, inc));
> +      else
> +     emit_insn (gen_sub2_insn (result, inc));
> +      emit_insn (gen_move_insn (incloc, result));
> +      /* Restore non-modified value for the result.  We prefer this
> +      way because it does not require an additional hard
> +      register.  */
> +      if (plus_p)
> +     {
> +       poly_int64 offset;
> +       if (poly_int_rtx_p (inc, &offset))
> +         emit_insn (gen_add2_insn (result,
> +                                   gen_int_mode (-offset,
> +                                                 GET_MODE (result))));
> +       else
> +         emit_insn (gen_sub2_insn (result, inc));
> +     }
> +      else
> +     emit_insn (gen_add2_insn (result, inc));
> +    }
> +  return result;
> +}
> +
> +/* Return a memory reference like MEM, but with the address reloaded into a
> +   pseudo register.  */
> +
> +rtx
> +force_reload_address (rtx mem)
> +{
> +  rtx addr = XEXP (mem, 0);
> +  if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC)
> +    {
> +      const auto size = GET_MODE_SIZE (GET_MODE (mem));
> +      addr = address_reload_context ().emit_autoinc (addr, size);
> +    }
> +  else
> +    addr = force_reg (Pmode, addr);
> +
> +  return replace_equiv_address (mem, addr);
> +}
> +
>  /* Return a memory reference like MEMREF, but with its mode widened to
>     MODE and offset by OFFSET.  This would be used by targets that e.g.
>     cannot issue QImode memory operations and have to use SImode memory
> diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h
> index c11a6f3eb62..0829851ed66 100644
> --- a/gcc/emit-rtl.h
> +++ b/gcc/emit-rtl.h
> @@ -519,6 +519,28 @@ extern rtx adjust_address_1 (rtx, machine_mode, 
> poly_int64, int, int,
>  extern rtx adjust_automodify_address_1 (rtx, machine_mode, rtx,
>                                       poly_int64, int);
>  
> +/* Class wrapping emit_autoinc which allows derived classes to control
> +   how reload pseudos are created.  */
> +struct address_reload_context
> +{
> +  /* Can be overriden by derived classes.  */
> +  virtual rtx get_reload_reg () const { return gen_reg_rtx (Pmode); }
> +
> +  /* Emit insns to reload VALUE into a new register.  VALUE is an
> +     auto-increment or auto-decrement RTX whose operand is a register or
> +     memory location; so reloading involves incrementing that location.
> +
> +     AMOUNT is the number to increment or decrement by (always
> +     positive and ignored for POST_MODIFY/PRE_MODIFY).
> +
> +     Return a pseudo containing the result.  */
> +  rtx emit_autoinc (rtx value, poly_int64 amount);
> +};
> +
> +/* Return a memory reference like MEM, but with the address reloaded into a
> +   pseudo register.  */
> +extern rtx force_reload_address (rtx mem);
> +
>  /* Return a memory reference like MEMREF, but whose address is changed by
>     adding OFFSET, an RTX, to it.  POW2 is the highest power of two factor
>     known to be in OFFSET (possibly 1).  */
> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
> index 177c765ca13..da7e1748d75 100644
> --- a/gcc/lra-constraints.cc
> +++ b/gcc/lra-constraints.cc
> @@ -3951,133 +3951,36 @@ process_address (int nop, bool check_only_p,
>    return res;
>  }
>  
> +/* Override the generic address_reload_context in order to
> +   control the creation of reload pseudos.  */
> +class lra_autoinc_reload_context : public address_reload_context
> +{
> +  machine_mode mode;
> +  enum reg_class rclass;
> +
> +public:
> +  lra_autoinc_reload_context (machine_mode mode, enum reg_class new_rclass)
> +    : mode (mode), rclass (new_rclass) {}
> +
> +  rtx get_reload_reg () const override final
> +  {
> +    return lra_create_new_reg (mode, NULL_RTX, rclass, NULL, "INC/DEC 
> result");
> +  }
> +};
> +
>  /* Emit insns to reload VALUE into a new register.  VALUE is an
>     auto-increment or auto-decrement RTX whose operand is a register or
>     memory location; so reloading involves incrementing that location.
> -   IN is either identical to VALUE, or some cheaper place to reload
> -   value being incremented/decremented from.
>  
>     INC_AMOUNT is the number to increment or decrement by (always
>     positive and ignored for POST_MODIFY/PRE_MODIFY).
>  
> -   Return pseudo containing the result.       */
> +   Return a pseudo containing the result.  */
>  static rtx
> -emit_inc (enum reg_class new_rclass, rtx in, rtx value, poly_int64 
> inc_amount)
> +emit_inc (enum reg_class new_rclass, rtx value, poly_int64 inc_amount)
>  {
> -  /* REG or MEM to be copied and incremented.  */
> -  rtx incloc = XEXP (value, 0);
> -  /* Nonzero if increment after copying.  */
> -  int post = (GET_CODE (value) == POST_DEC || GET_CODE (value) == POST_INC
> -           || GET_CODE (value) == POST_MODIFY);
> -  rtx_insn *last;
> -  rtx inc;
> -  rtx_insn *add_insn;
> -  int code;
> -  rtx real_in = in == value ? incloc : in;
> -  rtx result;
> -  bool plus_p = true;
> -
> -  if (GET_CODE (value) == PRE_MODIFY || GET_CODE (value) == POST_MODIFY)
> -    {
> -      lra_assert (GET_CODE (XEXP (value, 1)) == PLUS
> -               || GET_CODE (XEXP (value, 1)) == MINUS);
> -      lra_assert (rtx_equal_p (XEXP (XEXP (value, 1), 0), XEXP (value, 0)));
> -      plus_p = GET_CODE (XEXP (value, 1)) == PLUS;
> -      inc = XEXP (XEXP (value, 1), 1);
> -    }
> -  else
> -    {
> -      if (GET_CODE (value) == PRE_DEC || GET_CODE (value) == POST_DEC)
> -     inc_amount = -inc_amount;
> -
> -      inc = gen_int_mode (inc_amount, GET_MODE (value));
> -    }
> -
> -  if (! post && REG_P (incloc))
> -    result = incloc;
> -  else
> -    result = lra_create_new_reg (GET_MODE (value), value, new_rclass, NULL,
> -                              "INC/DEC result");
> -
> -  if (real_in != result)
> -    {
> -      /* First copy the location to the result register.  */
> -      lra_assert (REG_P (result));
> -      emit_insn (gen_move_insn (result, real_in));
> -    }
> -
> -  /* We suppose that there are insns to add/sub with the constant
> -     increment permitted in {PRE/POST)_{DEC/INC/MODIFY}.  At least the
> -     old reload worked with this assumption.  If the assumption
> -     becomes wrong, we should use approach in function
> -     base_plus_disp_to_reg.  */
> -  if (in == value)
> -    {
> -      /* See if we can directly increment INCLOC.  */
> -      last = get_last_insn ();
> -      add_insn = emit_insn (plus_p
> -                         ? gen_add2_insn (incloc, inc)
> -                         : gen_sub2_insn (incloc, inc));
> -
> -      code = recog_memoized (add_insn);
> -      if (code >= 0)
> -     {
> -       if (! post && result != incloc)
> -         emit_insn (gen_move_insn (result, incloc));
> -       return result;
> -     }
> -      delete_insns_since (last);
> -    }
> -
> -  /* If couldn't do the increment directly, must increment in RESULT.
> -     The way we do this depends on whether this is pre- or
> -     post-increment.  For pre-increment, copy INCLOC to the reload
> -     register, increment it there, then save back.  */
> -  if (! post)
> -    {
> -      if (real_in != result)
> -     emit_insn (gen_move_insn (result, real_in));
> -      if (plus_p)
> -     emit_insn (gen_add2_insn (result, inc));
> -      else
> -     emit_insn (gen_sub2_insn (result, inc));
> -      if (result != incloc)
> -     emit_insn (gen_move_insn (incloc, result));
> -    }
> -  else
> -    {
> -      /* Post-increment.
> -
> -      Because this might be a jump insn or a compare, and because
> -      RESULT may not be available after the insn in an input
> -      reload, we must do the incrementing before the insn being
> -      reloaded for.
> -
> -      We have already copied IN to RESULT.  Increment the copy in
> -      RESULT, save that back, then decrement RESULT so it has
> -      the original value.  */
> -      if (plus_p)
> -     emit_insn (gen_add2_insn (result, inc));
> -      else
> -     emit_insn (gen_sub2_insn (result, inc));
> -      emit_insn (gen_move_insn (incloc, result));
> -      /* Restore non-modified value for the result.  We prefer this
> -      way because it does not require an additional hard
> -      register.  */
> -      if (plus_p)
> -     {
> -       poly_int64 offset;
> -       if (poly_int_rtx_p (inc, &offset))
> -         emit_insn (gen_add2_insn (result,
> -                                   gen_int_mode (-offset,
> -                                                 GET_MODE (result))));
> -       else
> -         emit_insn (gen_sub2_insn (result, inc));
> -     }
> -      else
> -     emit_insn (gen_add2_insn (result, inc));
> -    }
> -  return result;
> +  lra_autoinc_reload_context context (GET_MODE (value), new_rclass);
> +  return context.emit_autoinc (value, inc_amount);
>  }
>  
>  /* Return true if the current move insn does not need processing as we
> @@ -4669,7 +4572,7 @@ curr_insn_transform (bool check_only_p)
>         rclass = base_reg_class (GET_MODE (op), MEM_ADDR_SPACE (op),
>                                  MEM, SCRATCH, curr_insn);
>         if (GET_RTX_CLASS (code) == RTX_AUTOINC)
> -         new_reg = emit_inc (rclass, *loc, *loc,
> +         new_reg = emit_inc (rclass, *loc,
>                               /* This value does not matter for MODIFY.  */
>                               GET_MODE_SIZE (GET_MODE (op)));
>         else if (get_reload_reg (OP_IN, Pmode, *loc, rclass,
> diff --git a/gcc/recog.h b/gcc/recog.h
> index 5c801e7bb81..bf0b809a44b 100644
> --- a/gcc/recog.h
> +++ b/gcc/recog.h
> @@ -398,6 +398,16 @@ struct recog_data_d
>  
>  extern struct recog_data_d recog_data;
>  
> +/* RAII class for saving/restoring recog_data.  */
> +
> +class recog_data_saver
> +{
> +  recog_data_d m_saved_data;
> +public:
> +  recog_data_saver () : m_saved_data (recog_data) {}
> +  ~recog_data_saver () { recog_data = m_saved_data; }
> +};
> +
>  #ifndef GENERATOR_FILE
>  extern const operand_alternative *recog_op_alt;
>

Re: [PATCH 1/2] emit-rtl, lra: Move lra's emit_inc to emit-rtl.cc

Reply via email to