[Bug middle-end/122871] [13/14/15/16/17 Regression] de-optimized synthesis of long long shift and add

cvs-commit at gcc dot gnu.org via Gcc-bugs Thu, 07 May 2026 10:49:29 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122871


--- Comment #10 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Roger Sayle <[email protected]>:

https://gcc.gnu.org/g:1a06a37611e3b27889c595a17df13f6d27202a95

commit r17-383-g1a06a37611e3b27889c595a17df13f6d27202a95
Author: Roger Sayle <[email protected]>
Date:   Thu May 7 18:46:37 2026 +0100

    PR middle-end/122871: Doubleword multiplication improvements

    This patch resolves PR middle-end/122871 by improving RTL expansion of
    doubleword multiplications.  The main change adds support to synth_mult
    for the case where the constant multiplier has BITS_PER_WORD or more
    trailing zeros.  The shift_cost tables in expmed are only parameterized
    for shifts of less than BITS_PER_WORD bits, so doubleword shifts by
    BITS_PER_WORD bits or more can't use the usual code path.  This patch
    teaches synth_mult that, for scalar doubleword multiplications, a
    doubleword shift by BITS_PER_WORD bits or more typically requires two
    instructions: one to set the result lowpart to zero, and a wordmode
    shift to calculate the result highpart.
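
As a rough illustration of the two-instruction shape described above, here is a minimal C sketch that assumes 32-bit words (so BITS_PER_WORD is 32) and a 64-bit doubleword. The `dword` type and the helper names are hypothetical, chosen for this example only; they are not GCC internals.

```c
#include <stdint.h>

/* Hypothetical model of a doubleword held as two 32-bit words. */
typedef struct { uint32_t lo, hi; } dword;

/* Shift left by n bits, where 32 <= n < 64. */
static dword dword_shl_ge_word(dword x, unsigned n)
{
    dword r;
    r.lo = 0;                 /* first instruction: result lowpart is zero */
    r.hi = x.lo << (n - 32);  /* second instruction: word-sized shift gives the highpart */
    return r;
}

/* Conversion helpers so the result can be compared with a native 64-bit shift. */
static uint64_t dword_to_u64(dword x)
{
    return ((uint64_t)x.hi << 32) | x.lo;
}

static dword u64_to_dword(uint64_t v)
{
    dword r = { (uint32_t)v, (uint32_t)(v >> 32) };
    return r;
}
```

The input highpart does not contribute at all, since every one of its bits is shifted out of the 64-bit result.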

    For the testcase given in the PR:

    long long ashll_fn (long long a)
    {
      long long c;

      c = a << 33;
      c += a;
      return c;
    }

    GCC for arm-linux-gnueabihf currently generates with -O2:

    ashll_fn:
            lsl     r2, r1, #11
            lsl     ip, r0, #11
            subs    ip, ip, r0
            orr     r2, r2, r0, lsr #21
            sbc     r2, r2, r1
            lsl     r3, ip, #11
            lsl     r2, r2, #11
            adds    r3, r3, r0
            orr     r2, r2, ip, lsr #21
            adc     r1, r1, r2
            lsl     r2, r1, #11
            lsl     r0, r3, #11
            adds    r0, r3, r0
            orr     r2, r2, r3, lsr #21
            adc     r1, r1, r2
            bx      lr

    With this patch, we instead generate:

    ashll_fn:
            add     r1, r1, r0, lsl #1
            bx      lr
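
To see why a single add suffices, note that `a << 33` has an all-zero low word, so adding `a` leaves the low word unchanged and produces no carry into the high word. The sketch below writes this out in C for 32-bit words. The function names are hypothetical, and unsigned types are used here (unlike the PR's `long long`) so that wraparound is well defined.

```c
#include <stdint.h>

/* Reference: the testcase's computation on an unsigned 64-bit value. */
static uint64_t ashll_ref(uint64_t a)
{
    return (a << 33) + a;
}

/* Word-level form, mirroring "add r1, r1, r0, lsl #1":
   r0 holds the low word, r1 the high word. */
static uint64_t ashll_words(uint64_t a)
{
    uint32_t lo = (uint32_t)a;
    uint32_t hi = (uint32_t)(a >> 32);
    hi += lo << 1;                      /* high word gains (lo << 1); low word is untouched */
    return ((uint64_t)hi << 32) | lo;
}
```

Both functions compute a multiplication by the constant 2^33 + 1, modulo 2^64.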

    Additionally, this patch includes a clean-up (identified by A. Pinski)
    to prevent RTL expansion of doubleword multiplications from
    initially emitting multiply instructions by immediate constants 0, 1
    or 2.  These dubious multiplications eventually get tidied up by later
    RTL optimization passes, but being sensible during RTL expansion
    both speeds up the compiler and reduces unnecessary memory usage.
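
The following C sketch suggests where such trivial multiplications can come from. It assumes the usual schoolbook decomposition of a doubleword multiply into word-sized pieces, with 32-bit words; the helper names are hypothetical and the code does not reproduce expand_doubleword_mult. When one operand is a constant, its high or low word may well be 0, 1 or 2, and those partial products need no multiply instruction.

```c
#include <stdint.h>

/* Hypothetical helper: word multiply that avoids a real multiply
   for the trivial constants 0, 1 and 2. */
static uint32_t mul_small(uint32_t x, uint32_t c)
{
    switch (c) {
    case 0:  return 0;
    case 1:  return x;
    case 2:  return x + x;      /* equivalently x << 1 */
    default: return x * c;      /* general case: an actual multiply */
    }
}

/* Schoolbook 64-bit multiply by a constant c, built from 32-bit words. */
static uint64_t dword_mul_const(uint64_t a, uint64_t c)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t c_lo = (uint32_t)c, c_hi = (uint32_t)(c >> 32);
    uint64_t low = (uint64_t)a_lo * c_lo;                 /* widening multiply */
    uint32_t cross = mul_small(a_hi, c_lo) + mul_small(a_lo, c_hi);
    return low + ((uint64_t)cross << 32);                 /* cross terms land in the high word */
}
```

For the PR's constant 2^33 + 1, the low word is 1 and the high word is 2, so both cross terms take the trivial paths.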

    2026-05-07  Roger Sayle  <[email protected]>

    gcc/ChangeLog
            PR middle-end/122871
            * expmed.cc (synth_mult): Handle doubleword left shifts by
            BITS_PER_WORD bits or more, for scalar modes.
            * optabs.cc (expand_doubleword_mult): Avoid generating multiply
            instructions by immediate constants 0, 1 or 2.

    gcc/testsuite/ChangeLog
            PR middle-end/122871
            * gcc.target/arm/muldi-1.c: New test case.

[Bug middle-end/122871] [13/14/15/16/17 Regression] de-optimized synthesis of long long shift and add