https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109476

--- Comment #18 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <sa...@gcc.gnu.org>:

https://gcc.gnu.org/g:650c36ec461a722d9c65e82512b4c3aeec2ffee1

commit r14-335-g650c36ec461a722d9c65e82512b4c3aeec2ffee1
Author: Roger Sayle <ro...@nextmovesoftware.com>
Date:   Fri Apr 28 14:21:53 2023 +0100

    PR rtl-optimization/109476: Use ZERO_EXTEND instead of zeroing a SUBREG.

    This patch fixes PR rtl-optimization/109476, which is a code quality
    regression affecting AVR.  The cause is that the lower-subreg pass is
    sometimes overly aggressive, lowering the LSHIFTRT below:

    (insn 7 4 8 2 (set (reg:HI 51)
            (lshiftrt:HI (reg/v:HI 49 [ b ])
                (const_int 8 [0x8]))) "t.ii":4:36 557 {lshrhi3}
         (nil))

    into a pair of QImode SUBREG assignments:

    (insn 19 4 20 2 (set (subreg:QI (reg:HI 51) 0)
            (reg:QI 54 [ b+1 ])) "t.ii":4:36 86 {movqi_insn_split}
         (nil))
    (insn 20 19 8 2 (set (subreg:QI (reg:HI 51) 1)
            (const_int 0 [0])) "t.ii":4:36 86 {movqi_insn_split}
         (nil))

    but this idiom, SETs of SUBREGs, interferes with combine's ability
    to associate/fuse instructions.  The solution, on targets that
    have a suitable ZERO_EXTEND (i.e. where the lower-subreg pass
    wouldn't itself split a ZERO_EXTEND, so "splitting_zext" is false),
    is to split/lower LSHIFTRT to a ZERO_EXTEND.
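
The equivalence the patch relies on can be checked at the value level. The following is a minimal C sketch, not code from GCC: the function names are hypothetical, and the two QImode halves of the HImode register are modeled arithmetically with uint8_t values rather than with real SUBREGs. It shows that a 16-bit logical right shift by 8 gives the same result whether it is built from two separate byte assignments (the form that hinders combine) or from one zero extension of the high byte.

```c
#include <stdint.h>

/* Hypothetical model of the "pair of QImode SUBREG assignments":
   the result is assembled from two independently assigned bytes. */
uint16_t shift_via_byte_stores(uint16_t b)
{
    uint8_t lo = (uint8_t)(b >> 8); /* byte 0 of result := high byte of b */
    uint8_t hi = 0;                 /* byte 1 of result := 0 */
    return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* Hypothetical model of the ZERO_EXTEND form: a single widening
   conversion of the high byte; unsigned widening fills with zeros. */
uint16_t shift_via_zero_extend(uint16_t b)
{
    uint8_t high_byte = (uint8_t)(b >> 8);
    return (uint16_t)high_byte;
}

/* Exhaustively compare both forms against a plain 16-bit logical
   right shift by 8; returns the number of inputs that disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (uint32_t v = 0; v <= 0xFFFFu; v++) {
        uint16_t b = (uint16_t)v;
        uint16_t expected = (uint16_t)(b >> 8);
        if (shift_via_byte_stores(b) != expected
            || shift_via_zero_extend(b) != expected)
            mismatches++;
    }
    return mismatches;
}
```

Calling count_mismatches() covers all 65536 inputs, so the sketch only illustrates that the two forms compute the same value; it says nothing about how combine treats either one.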

    To answer Richard's question in comment #10 of the bugzilla PR,
    the function resolve_shift_zext is called with one of four RTX
    codes, ASHIFTRT, LSHIFTRT, ZERO_EXTEND and ASHIFT, but only with
    LSHIFTRT can the setting of low_part and high_part SUBREGs be
    replaced by a ZERO_EXTEND.  For ASHIFTRT, we require a sign
    extension, so we don't set the high_part to zero.  If we're
    splitting a ZERO_EXTEND, it doesn't make sense to replace it with
    another ZERO_EXTEND.  For ASHIFT, we've swapped the high_part and
    low_part SUBREGs so that it is the low_part that is assigned zero
    (for double-word shifts by more than the word size in bits).
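
To make the ASHIFTRT and ASHIFT cases concrete, here is a small C sketch under stated assumptions: a 16-bit value stands in for the double-word register, each 8-bit half stands in for a word, and the shift count is exactly one word. The function names are hypothetical and none of this is GCC code. It shows why a zero extension of the moved byte reproduces neither result: the arithmetic right shift needs a sign fill in the high byte, and the left shift puts the zero in the low byte.

```c
#include <stdint.h>

/* ASHIFTRT by 8, byte-wise: the low byte receives the input's high
   byte and the high byte receives copies of the sign bit. */
uint16_t ashiftrt8_bytes(uint16_t b)
{
    uint8_t lo = (uint8_t)(b >> 8);
    uint8_t hi = (b & 0x8000u) ? 0xFFu : 0x00u; /* sign fill, not zero */
    return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* ASHIFT by 8, byte-wise: the high byte receives the input's low
   byte and the LOW byte is the one set to zero. */
uint16_t ashift8_bytes(uint16_t b)
{
    uint8_t hi = (uint8_t)(b & 0xFFu);
    uint8_t lo = 0;
    return (uint16_t)(lo | ((uint16_t)hi << 8));
}

/* Zero extension of a single byte: the moved byte lands in the low
   half and the high half is zero. */
uint16_t zero_extend_byte(uint8_t x)
{
    return (uint16_t)x;
}
```

For example, ashiftrt8_bytes(0x8001) yields 0xFF80 while zero_extend_byte(0x80) yields 0x0080, and ashift8_bytes(0x00AB) yields 0xAB00 while zero_extend_byte(0xAB) yields 0x00AB.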

    2023-04-28  Roger Sayle  <ro...@nextmovesoftware.com>

    gcc/ChangeLog
            PR rtl-optimization/109476
            * lower-subreg.cc: Include explow.h for force_reg.
            (find_decomposable_shift_zext): Pass an additional SPEED_P
            argument.
            If decomposing a suitable LSHIFTRT and we're not splitting
            ZERO_EXTEND (based on the current SPEED_P), then use a ZERO_EXTEND
            instead of setting a high part SUBREG to zero, which helps combine.
            (decompose_multiword_subregs): Update call to resolve_shift_zext.

    gcc/testsuite/ChangeLog
            PR rtl-optimization/109476
            * gcc.target/avr/mmcu/pr109476.c: New test case.
