https://gcc.gnu.org/bugzilla/show_bug.cgi?id=125794

--- Comment #4 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Kyrylo Tkachov <[email protected]>:

https://gcc.gnu.org/g:667a1b5285c2d5d075a324a29323804f93d43add

commit r17-1584-g667a1b5285c2d5d075a324a29323804f93d43add
Author: Kyrylo Tkachov <[email protected]>
Date:   Tue Jun 16 00:58:28 2026 -0700

    aarch64: Fix wrong code for high-64-zero Advanced SIMD constants [PR125794]

    r17-1491-gf152cf1734f808 (PR113926) taught aarch64_simd_valid_imm to
    materialize a 128-bit Advanced SIMD MOV constant whose high 64 bits are
    zero with a 64-bit MOVI/FMOV, which zeroes the upper half of the
    register.  It records this with simd_immediate_info::width == 64
    (output_width).

    However, when the low 64 bits are not themselves a valid Advanced SIMD
    (MOVI/MVNI/FMOV) immediate, the function fell through to the SVE
    immediate forms (aarch64_sve_valid_immediate).  Those use a replicating
    "mov zN.<T>, #imm", which sets the whole vector, including the high 64
    bits that were required to be zero, to the repeated low-64-bit value.
    For e.g. the V4SI constant { 0, 1, 0, 0 } this emitted

            mov     z31.d, #4294967296      // 0x100000000, i.e. { 0, 1, 0, 1 }

    instead of the intended { 0, 1, 0, 0 }, producing wrong code.
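
The lane arithmetic behind this example can be checked with a small standalone model. This is only an illustrative sketch, not GCC code: the helper names `replicate_d` and `low64_zero_high` are hypothetical, and it assumes little-endian lane numbering, where the low 32 bits of each 64-bit element form the lower-numbered 32-bit lane.

```c
#include <stdint.h>

/* Hypothetical model of a replicating "mov zN.d, #imm", restricted to the
   low 128 bits: every 64-bit element receives the same immediate.  */
static void replicate_d(uint64_t imm, uint32_t out[4])
{
  for (int half = 0; half < 2; half++)
    {
      out[2 * half]     = (uint32_t) imm;          /* lower 32-bit lane */
      out[2 * half + 1] = (uint32_t) (imm >> 32);  /* upper 32-bit lane */
    }
}

/* Hypothetical model of a 64-bit Advanced SIMD write: the low 64 bits are
   set and the high 64 bits of the register are zeroed.  */
static void low64_zero_high(uint64_t imm, uint32_t out[4])
{
  out[0] = (uint32_t) imm;
  out[1] = (uint32_t) (imm >> 32);
  out[2] = 0;
  out[3] = 0;
}
```

With the immediate 4294967296 (0x100000000), the replicating model fills all four lanes from the same 64-bit value, while the 64-bit model leaves the upper two lanes zero, which is the difference between the emitted and the intended constant.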

    Fix it by not falling through to the SVE forms when output_width is set:
    such a constant must be formed by a 64-bit Advanced SIMD MOVI/FMOV
    (handled by the Advanced SIMD and floating-point paths just above) or
    not at all, in which case the caller materializes it some other way
    (e.g. a literal-pool load), which is the pre-r17-1491 behavior for these
    constants.

    The PR113926 optimization is unaffected: it only applies when the
    Advanced SIMD or floating-point path accepts the low 64 bits, and those
    still return true before the new check.
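
The ordering described above can be summarized as a simplified decision model. This is a sketch under stated assumptions, not the actual body of aarch64_simd_valid_imm: the struct `imm_query`, its fields, and the function `valid_imm_model` are hypothetical names that stand in for the checks the commit message describes.

```c
#include <stdbool.h>

/* Hypothetical summary of the properties of a candidate constant.  */
struct imm_query
{
  bool advsimd_or_fp_ok;   /* low 64 bits form a valid MOVI/MVNI/FMOV immediate */
  bool sve_ok;             /* value matches a replicating SVE "mov zN.<T>, #imm" */
  unsigned output_width;   /* 64 when the high 64 bits must be zero, else 0 */
};

/* Hypothetical model of the check order after the fix.  */
static bool valid_imm_model(struct imm_query q)
{
  /* The Advanced SIMD and floating-point paths run first, so the
     PR113926 optimization still succeeds when they accept the value.  */
  if (q.advsimd_or_fp_ok)
    return true;

  /* New check: a high-64-zero constant must not reach the replicating
     SVE forms; the caller materializes it some other way.  */
  if (q.output_width != 0)
    return false;

  return q.sve_ok;
}
```

In this model the { 0, 1, 0, 0 } case corresponds to a query with `advsimd_or_fp_ok` false, `sve_ok` true and `output_width` 64, which is now rejected instead of being accepted through the SVE path.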

    Bootstrapped and regression-tested on aarch64-linux-gnu.
    Pushing to trunk.

    Signed-off-by: Kyrylo Tkachov <[email protected]>

    gcc/ChangeLog:

            PR target/125794
            * config/aarch64/aarch64.cc (aarch64_simd_valid_imm): Do not fall
            through to the replicating SVE immediate forms for a 128-bit
            Advanced SIMD constant whose high 64 bits are zero (output_width
            != 0).

    gcc/testsuite/ChangeLog:

            PR target/125794
            * gcc.target/aarch64/sve/pr125794.c: New test.
