https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106253

--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The trunk branch has been updated by Richard Sandiford <rsand...@gcc.gnu.org>:

https://gcc.gnu.org/g:7313381d2ce44b72b4c9f70bd5670e5d78d1f631

commit r13-1730-g7313381d2ce44b72b4c9f70bd5670e5d78d1f631
Author: Richard Sandiford <richard.sandif...@arm.com>
Date:   Mon Jul 18 12:57:10 2022 +0100

    arm: Replace arm_builtin_vectorized_function [PR106253]

    This patch extends the fix for PR106253 to AArch32.  As with AArch64,
    we were using ACLE intrinsics to vectorise scalar built-ins, even
    though the two sometimes have different ECF_* flags.  (That in turn
    is because the ACLE intrinsics should follow the instruction semantics
    as closely as possible, whereas the scalar built-ins follow language
    specs.)

    The patch also removes the copysignf built-in, which only existed
    for this purpose and wasn't a “real” arm_neon.h built-in.

    Doing this also has the side-effect of enabling vectorisation of
    rint and roundeven.  Logically that should be a separate patch,
    but making it one would have meant adding a new int iterator
    for the original set of instructions and then removing it again
    when including new functions.

    I've restricted the bswap tests to little-endian because we end
    up with excessive spilling on big-endian.  E.g.:

            sub     sp, sp, #8
            vstr    d1, [sp]
            vldr    d16, [sp]
            vrev16.8        d16, d16
            vstr    d16, [sp]
            vldr    d0, [sp]
            add     sp, sp, #8
            @ sp needed
            bx      lr

    Similarly, the copysign tests require little-endian because on
    big-endian we unnecessarily load the constant from the constant pool:

            vldr.32 s15, .L3
            vdup.32 d0, d7[1]
            vbsl    d0, d2, d1
            bx      lr
    .L3:
            .word   -2147483648

    gcc/
            PR target/106253
            * config/arm/arm-builtins.cc (arm_builtin_vectorized_function):
            Delete.
            * config/arm/arm-protos.h (arm_builtin_vectorized_function):
            Delete.
            * config/arm/arm.cc (TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION):
            Delete.
            * config/arm/arm_neon_builtins.def (copysignf): Delete.
            * config/arm/iterators.md (nvrint_pattern): New attribute.
            * config/arm/neon.md (<NEON_VRINT:nvrint_pattern><VCVTF:mode>2):
            New pattern.
            (l<NEON_VCVT:nvrint_pattern><su_optab><VCVTF:mode><v_cmp_result>2):
            Likewise.
            (neon_copysignf<mode>): Rename to...
            (copysign<mode>3): ...this.

    gcc/testsuite/
            PR target/106253
            * gcc.target/arm/vect_unary_1.c: New test.
            * gcc.target/arm/vect_binary_1.c: Likewise.
