On 6/9/25 12:40 PM, Dimitar Dimitrov wrote:
On Sun, Jun 08, 2025 at 09:09:44AM -0600, Jeff Law wrote:


On 6/5/25 2:16 PM, Dimitar Dimitrov wrote:
PR119966 showed that combine could generate unfoldable hardware subregs
for pru-unknown-elf.  To fix, strengthen the checks performed by
validate_subreg.

The simplify_subreg_regno performs more validity checks than
the simple info.representable_p.  Most importantly, the
targetm.hard_regno_mode_ok hook is called to ensure the hardware
register is valid in subreg's outer mode.  This fixes the rootcause
for PR119966.

The checks for stack-related registers are bypassed because the i386
backend generates them, in this seemingly valid peephole optimization:

     ;; Attempt to always use XOR for zeroing registers (including FP modes).
     (define_peephole2
       [(set (match_operand 0 "general_reg_operand")
             (match_operand 1 "const0_operand"))]
       "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
        && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
        && peep2_regno_dead_p (0, FLAGS_REG)"
       [(parallel [(set (match_dup 0) (const_int 0))
                   (clobber (reg:CC FLAGS_REG))])]
       "operands[0] = gen_lowpart (word_mode, operands[0]);")

Testing done:
    * No regressions were detected for C and C++ on x86_64-pc-linux-gnu.
    * "contrib/compare-all-tests i386" showed no difference in code
      generation.
    * No regressions for pru-unknown-elf.
    * Reverted r16-809-gf725d6765373f7 to expose the now latent PR119966.
      Then ensured pru-unknown-elf build is ok.  Only two cases regressed
      where rnreg pass transforms a valid hardware subreg into invalid
      one.  But I think that is not related to combine's PR119966:
        gcc.c-torture/execute/20040709-1.c
        gcc.c-torture/execute/20040709-2.c

This patch was provisionally approved in:
    https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685492.html
I'll wait for 2 days to get pre-commit CI results, then will commit it.

        PR target/119966

gcc/ChangeLog:

        * emit-rtl.cc (validate_subreg): Call simplify_subreg_regno
        instead of checking info.representable_p..
        * rtl.h (simplify_subreg_regno): Add new argument
        allow_stack_regs.
        * rtlanal.cc (simplify_subreg_regno): Do not reject
        stack-related registers if allow_stack_regs is true.
Note that my system is primarily post-commit unless I
explicitly throw in a patch for testing purposes.  So
figure ~24hrs after the patch goes in we'll have the
vast majority of the coverage.  The native emulated
targets will trickle in over the remainder of the week.


I did more testing.  riscv32-unknown-elf and riscv64-unknown-linux-gnu
are ok.  But this patch breaks AVR due to the following workaround for
old reload in avr_hard_regno_mode_ok:

   if (GET_MODE_SIZE (mode) >= 4
       && regno >= REG_X
       // This problem only concerned the old reload.
       && ! avropt_lra_p)
     return false;

The following insn causes a split for "zero_extendsidi2" to ICE because
operands are replaced with output from simplify_gen_subreg, which now
fails:

   (insn 17 16 18 2 (set (reg/v:DI 24 r24 [orig:44 lowD.4526 ] [44])
           (zero_extend:DI (reg:SI 22 r22 [orig:66 aD.4519 ] [66]))) 
"/mnt/nvme/dinux/local-workspace/gcc/libgcc/fixed-bit.c":790:7 827 
{zero_extendsidi2}
        (nil))

That should be fixed once AVR is switched to LRA (PR113934).

Richard, Jeff, it does not seem appropriate to merge this patch now,
given that it breaks avr and or1k.  Let me know if such experiment is
desired despite the known breakages.
It's a bit of a judgment call -- we don't require changes to be clean across every target, that would be an undue testing burden on contributors.

But knowing it's causing an avr fail means we definitely need to be a bit more cautious. An argument could be made that this is a bug in the avr port -- the code in question looks quite bogus and the fact that the "fix" is to stop using reload.

I think a reasonable step forward would be to put this into my tester and see if anything else pops out. If nothing else pops out then I would suggest we flip the AVR port to LRA and install the patch. If other things pop out, then we'll have to revise the plan.

So I'll throw it into my tester. We'll have data on the crosses tomorrow AM my time.

Jeff

Reply via email to