Alex Coplan <alex.cop...@arm.com> writes: > Hi, > > In PR112906 we ICE because we try to use force_reg to reload an > auto-increment address, but force_reg can't do this. > > With the aim of fixing the PR by supporting reloading arbitrary > addresses in pre-RA splitters, this patch generalizes > lra-constraints.cc:emit_inc and makes it available to the rest of the > compiler by moving the generalized version to emit-rtl.cc. > > We observe that the separate IN parameter to LRA's emit_inc is > redundant, since the function is static and is only (statically) called > once in lra-constraints.cc, with in == value. As such, we drop the IN > parameter and simplify the code accordingly. > > We wrap the emit_inc code in a virtual class to allow LRA to override > how reload pseudos are created, thereby preserving the existing LRA > behaviour as much as possible. > > We then add a second (higher-level) routine to emit-rtl.cc, > force_reload_address, which can reload arbitrary addresses. This uses > the generalized emit_inc code to handle the RTX_AUTOINC case. The > second patch in this series uses force_reload_address to fix PR112906. > > Since we intend to call address_reload_context::emit_autoinc from within > splitters, and the code lifted from LRA calls recog, we have to avoid > clobbering recog_data. We do this by introducing a new RAII class for > saving/restoring recog_data on the stack. > > Bootstrapped/regtested on aarch64-linux-gnu, bootstrapped on > x86_64-linux-gnu, OK for trunk? > > Thanks, > Alex > > gcc/ChangeLog: > > PR target/112906 > * emit-rtl.cc (address_reload_context::emit_autoinc): New. > (force_reload_address): New. > * emit-rtl.h (struct address_reload_context): Declare. > (force_reload_address): Declare. > * lra-constraints.cc (class lra_autoinc_reload_context): New. > (emit_inc): Drop IN parameter, invoke > code moved to emit-rtl.cc:address_reload_context::emit_autoinc. > (curr_insn_transform): Drop redundant IN parameter in call to > emit_inc. > * recog.h (class recog_data_saver): New.
LGTM, and looks like it should functionally be a no-op for LRA. So OK, but please give Vlad 24 hours to comment, or to ask for more time. Thanks, Richard > > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc > index 4a7e420e7c0..ce7b98bf006 100644 > --- a/gcc/emit-rtl.cc > +++ b/gcc/emit-rtl.cc > @@ -58,6 +58,7 @@ along with GCC; see the file COPYING3. If not see > #include "rtl-iter.h" > #include "stor-layout.h" > #include "opts.h" > +#include "optabs.h" > #include "predict.h" > #include "rtx-vector-builder.h" > #include "gimple.h" > @@ -2576,6 +2577,140 @@ replace_equiv_address_nv (rtx memref, rtx addr, bool > inplace) > return change_address_1 (memref, VOIDmode, addr, 0, inplace); > } > > + > +/* Emit insns to reload VALUE into a new register. VALUE is an > + auto-increment or auto-decrement RTX whose operand is a register or > + memory location; so reloading involves incrementing that location. > + > + INC_AMOUNT is the number to increment or decrement by (always > + positive and ignored for POST_MODIFY/PRE_MODIFY). > + > + Return a pseudo containing the result. */ > +rtx > +address_reload_context::emit_autoinc (rtx value, poly_int64 inc_amount) > +{ > + /* Since we're going to call recog, and might be called within recog, > + we need to ensure we save and restore recog_data. */ > + recog_data_saver recog_save; > + > + /* REG or MEM to be copied and incremented. */ > + rtx incloc = XEXP (value, 0); > + > + const rtx_code code = GET_CODE (value); > + const bool post_p > + = code == POST_DEC || code == POST_INC || code == POST_MODIFY; > + > + bool plus_p = true; > + rtx inc; > + if (code == PRE_MODIFY || code == POST_MODIFY) > + { > + gcc_assert (GET_CODE (XEXP (value, 1)) == PLUS > + || GET_CODE (XEXP (value, 1)) == MINUS); > + gcc_assert (rtx_equal_p (XEXP (XEXP (value, 1), 0), XEXP (value, 0))); > + plus_p = GET_CODE (XEXP (value, 1)) == PLUS; > + inc = XEXP (XEXP (value, 1), 1); > + } > + else > + { > + if (code == PRE_DEC || code == POST_DEC) > + inc_amount = -inc_amount; > + > + inc = gen_int_mode (inc_amount, GET_MODE (value)); > + } > + > + rtx result; > + if (!post_p && REG_P (incloc)) > + result = incloc; > + else > + { > + result = get_reload_reg (); > + /* First copy the location to the result register. */ > + emit_insn (gen_move_insn (result, incloc)); > + } > + > + /* See if we can directly increment INCLOC. */ > + rtx_insn *last = get_last_insn (); > + rtx_insn *add_insn = emit_insn (plus_p > + ? gen_add2_insn (incloc, inc) > + : gen_sub2_insn (incloc, inc)); > + const int icode = recog_memoized (add_insn); > + if (icode >= 0) > + { > + if (!post_p && result != incloc) > + emit_insn (gen_move_insn (result, incloc)); > + return result; > + } > + delete_insns_since (last); > + > + /* If couldn't do the increment directly, must increment in RESULT. > + The way we do this depends on whether this is pre- or > + post-increment. For pre-increment, copy INCLOC to the reload > + register, increment it there, then save back. */ > + if (!post_p) > + { > + if (incloc != result) > + emit_insn (gen_move_insn (result, incloc)); > + if (plus_p) > + emit_insn (gen_add2_insn (result, inc)); > + else > + emit_insn (gen_sub2_insn (result, inc)); > + if (incloc != result) > + emit_insn (gen_move_insn (incloc, result)); > + } > + else > + { > + /* Post-increment. > + > + Because this might be a jump insn or a compare, and because > + RESULT may not be available after the insn in an input > + reload, we must do the incrementing before the insn being > + reloaded for. > + > + We have already copied INCLOC to RESULT. Increment the copy in > + RESULT, save that back, then decrement RESULT so it has > + the original value. */ > + if (plus_p) > + emit_insn (gen_add2_insn (result, inc)); > + else > + emit_insn (gen_sub2_insn (result, inc)); > + emit_insn (gen_move_insn (incloc, result)); > + /* Restore non-modified value for the result. We prefer this > + way because it does not require an additional hard > + register. */ > + if (plus_p) > + { > + poly_int64 offset; > + if (poly_int_rtx_p (inc, &offset)) > + emit_insn (gen_add2_insn (result, > + gen_int_mode (-offset, > + GET_MODE (result)))); > + else > + emit_insn (gen_sub2_insn (result, inc)); > + } > + else > + emit_insn (gen_add2_insn (result, inc)); > + } > + return result; > +} > + > +/* Return a memory reference like MEM, but with the address reloaded into a > + pseudo register. */ > + > +rtx > +force_reload_address (rtx mem) > +{ > + rtx addr = XEXP (mem, 0); > + if (GET_RTX_CLASS (GET_CODE (addr)) == RTX_AUTOINC) > + { > + const auto size = GET_MODE_SIZE (GET_MODE (mem)); > + addr = address_reload_context ().emit_autoinc (addr, size); > + } > + else > + addr = force_reg (Pmode, addr); > + > + return replace_equiv_address (mem, addr); > +} > + > /* Return a memory reference like MEMREF, but with its mode widened to > MODE and offset by OFFSET. This would be used by targets that e.g. > cannot issue QImode memory operations and have to use SImode memory > diff --git a/gcc/emit-rtl.h b/gcc/emit-rtl.h > index c11a6f3eb62..0829851ed66 100644 > --- a/gcc/emit-rtl.h > +++ b/gcc/emit-rtl.h > @@ -519,6 +519,28 @@ extern rtx adjust_address_1 (rtx, machine_mode, > poly_int64, int, int, > extern rtx adjust_automodify_address_1 (rtx, machine_mode, rtx, > poly_int64, int); > > +/* Class wrapping emit_autoinc which allows derived classes to control > + how reload pseudos are created. */ > +struct address_reload_context > +{ > + /* Can be overriden by derived classes. */ > + virtual rtx get_reload_reg () const { return gen_reg_rtx (Pmode); } > + > + /* Emit insns to reload VALUE into a new register. VALUE is an > + auto-increment or auto-decrement RTX whose operand is a register or > + memory location; so reloading involves incrementing that location. > + > + AMOUNT is the number to increment or decrement by (always > + positive and ignored for POST_MODIFY/PRE_MODIFY). > + > + Return a pseudo containing the result. */ > + rtx emit_autoinc (rtx value, poly_int64 amount); > +}; > + > +/* Return a memory reference like MEM, but with the address reloaded into a > + pseudo register. */ > +extern rtx force_reload_address (rtx mem); > + > /* Return a memory reference like MEMREF, but whose address is changed by > adding OFFSET, an RTX, to it. POW2 is the highest power of two factor > known to be in OFFSET (possibly 1). */ > diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc > index 177c765ca13..da7e1748d75 100644 > --- a/gcc/lra-constraints.cc > +++ b/gcc/lra-constraints.cc > @@ -3951,133 +3951,36 @@ process_address (int nop, bool check_only_p, > return res; > } > > +/* Override the generic address_reload_context in order to > + control the creation of reload pseudos. */ > +class lra_autoinc_reload_context : public address_reload_context > +{ > + machine_mode mode; > + enum reg_class rclass; > + > +public: > + lra_autoinc_reload_context (machine_mode mode, enum reg_class new_rclass) > + : mode (mode), rclass (new_rclass) {} > + > + rtx get_reload_reg () const override final > + { > + return lra_create_new_reg (mode, NULL_RTX, rclass, NULL, "INC/DEC > result"); > + } > +}; > + > /* Emit insns to reload VALUE into a new register. VALUE is an > auto-increment or auto-decrement RTX whose operand is a register or > memory location; so reloading involves incrementing that location. > - IN is either identical to VALUE, or some cheaper place to reload > - value being incremented/decremented from. > > INC_AMOUNT is the number to increment or decrement by (always > positive and ignored for POST_MODIFY/PRE_MODIFY). > > - Return pseudo containing the result. */ > + Return a pseudo containing the result. */ > static rtx > -emit_inc (enum reg_class new_rclass, rtx in, rtx value, poly_int64 > inc_amount) > +emit_inc (enum reg_class new_rclass, rtx value, poly_int64 inc_amount) > { > - /* REG or MEM to be copied and incremented. */ > - rtx incloc = XEXP (value, 0); > - /* Nonzero if increment after copying. */ > - int post = (GET_CODE (value) == POST_DEC || GET_CODE (value) == POST_INC > - || GET_CODE (value) == POST_MODIFY); > - rtx_insn *last; > - rtx inc; > - rtx_insn *add_insn; > - int code; > - rtx real_in = in == value ? incloc : in; > - rtx result; > - bool plus_p = true; > - > - if (GET_CODE (value) == PRE_MODIFY || GET_CODE (value) == POST_MODIFY) > - { > - lra_assert (GET_CODE (XEXP (value, 1)) == PLUS > - || GET_CODE (XEXP (value, 1)) == MINUS); > - lra_assert (rtx_equal_p (XEXP (XEXP (value, 1), 0), XEXP (value, 0))); > - plus_p = GET_CODE (XEXP (value, 1)) == PLUS; > - inc = XEXP (XEXP (value, 1), 1); > - } > - else > - { > - if (GET_CODE (value) == PRE_DEC || GET_CODE (value) == POST_DEC) > - inc_amount = -inc_amount; > - > - inc = gen_int_mode (inc_amount, GET_MODE (value)); > - } > - > - if (! post && REG_P (incloc)) > - result = incloc; > - else > - result = lra_create_new_reg (GET_MODE (value), value, new_rclass, NULL, > - "INC/DEC result"); > - > - if (real_in != result) > - { > - /* First copy the location to the result register. */ > - lra_assert (REG_P (result)); > - emit_insn (gen_move_insn (result, real_in)); > - } > - > - /* We suppose that there are insns to add/sub with the constant > - increment permitted in {PRE/POST)_{DEC/INC/MODIFY}. At least the > - old reload worked with this assumption. If the assumption > - becomes wrong, we should use approach in function > - base_plus_disp_to_reg. */ > - if (in == value) > - { > - /* See if we can directly increment INCLOC. */ > - last = get_last_insn (); > - add_insn = emit_insn (plus_p > - ? gen_add2_insn (incloc, inc) > - : gen_sub2_insn (incloc, inc)); > - > - code = recog_memoized (add_insn); > - if (code >= 0) > - { > - if (! post && result != incloc) > - emit_insn (gen_move_insn (result, incloc)); > - return result; > - } > - delete_insns_since (last); > - } > - > - /* If couldn't do the increment directly, must increment in RESULT. > - The way we do this depends on whether this is pre- or > - post-increment. For pre-increment, copy INCLOC to the reload > - register, increment it there, then save back. */ > - if (! post) > - { > - if (real_in != result) > - emit_insn (gen_move_insn (result, real_in)); > - if (plus_p) > - emit_insn (gen_add2_insn (result, inc)); > - else > - emit_insn (gen_sub2_insn (result, inc)); > - if (result != incloc) > - emit_insn (gen_move_insn (incloc, result)); > - } > - else > - { > - /* Post-increment. > - > - Because this might be a jump insn or a compare, and because > - RESULT may not be available after the insn in an input > - reload, we must do the incrementing before the insn being > - reloaded for. > - > - We have already copied IN to RESULT. Increment the copy in > - RESULT, save that back, then decrement RESULT so it has > - the original value. */ > - if (plus_p) > - emit_insn (gen_add2_insn (result, inc)); > - else > - emit_insn (gen_sub2_insn (result, inc)); > - emit_insn (gen_move_insn (incloc, result)); > - /* Restore non-modified value for the result. We prefer this > - way because it does not require an additional hard > - register. */ > - if (plus_p) > - { > - poly_int64 offset; > - if (poly_int_rtx_p (inc, &offset)) > - emit_insn (gen_add2_insn (result, > - gen_int_mode (-offset, > - GET_MODE (result)))); > - else > - emit_insn (gen_sub2_insn (result, inc)); > - } > - else > - emit_insn (gen_add2_insn (result, inc)); > - } > - return result; > + lra_autoinc_reload_context context (GET_MODE (value), new_rclass); > + return context.emit_autoinc (value, inc_amount); > } > > /* Return true if the current move insn does not need processing as we > @@ -4669,7 +4572,7 @@ curr_insn_transform (bool check_only_p) > rclass = base_reg_class (GET_MODE (op), MEM_ADDR_SPACE (op), > MEM, SCRATCH, curr_insn); > if (GET_RTX_CLASS (code) == RTX_AUTOINC) > - new_reg = emit_inc (rclass, *loc, *loc, > + new_reg = emit_inc (rclass, *loc, > /* This value does not matter for MODIFY. */ > GET_MODE_SIZE (GET_MODE (op))); > else if (get_reload_reg (OP_IN, Pmode, *loc, rclass, > diff --git a/gcc/recog.h b/gcc/recog.h > index 5c801e7bb81..bf0b809a44b 100644 > --- a/gcc/recog.h > +++ b/gcc/recog.h > @@ -398,6 +398,16 @@ struct recog_data_d > > extern struct recog_data_d recog_data; > > +/* RAII class for saving/restoring recog_data. */ > + > +class recog_data_saver > +{ > + recog_data_d m_saved_data; > +public: > + recog_data_saver () : m_saved_data (recog_data) {} > + ~recog_data_saver () { recog_data = m_saved_data; } > +}; > + > #ifndef GENERATOR_FILE > extern const operand_alternative *recog_op_alt; >