On 5/2/23 17:28, Kyrylo Tkachov wrote:


-----Original Message-----
From: Christophe Lyon <christophe.l...@arm.com>
Sent: Tuesday, May 2, 2023 3:05 PM
To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patches@gcc.gnu.org;
Richard Earnshaw <richard.earns...@arm.com>; Richard Sandiford
<richard.sandif...@arm.com>
Subject: Re: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq




On 5/2/23 12:26, Kyrylo Tkachov wrote:


        Hi Christophe,


                -----Original Message-----
                From: Christophe Lyon <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>
                Sent: Tuesday, April 18, 2023 2:46 PM
                To: gcc-patches@gcc.gnu.org <mailto:gcc-
patc...@gcc.gnu.org> ; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
<mailto:kyrylo.tkac...@arm.com> ;
                Richard Earnshaw <richard.earns...@arm.com>
<mailto:richard.earns...@arm.com> ; Richard Sandiford
                <richard.sandif...@arm.com>
<mailto:richard.sandif...@arm.com>
                Cc: Christophe Lyon <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>
                Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework
vreinterpretq

                This patch implements vreinterpretq using the new MVE
intrinsics
                framework.

                The old definitions for vreinterpretq are removed as a
consequence.

                2022-09-08  Murray Steele  <murray.ste...@arm.com>
<mailto:murray.ste...@arm.com>
                            Christophe Lyon  <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>

                        gcc/
                        * config/arm/arm-mve-builtins-base.cc
(vreinterpretq_impl): New
                class.
                        * config/arm/arm-mve-builtins-base.def: Define
vreinterpretq.
                        * config/arm/arm-mve-builtins-base.h
(vreinterpretq): New
                declaration.
                        * config/arm/arm-mve-builtins-shapes.cc
(parse_element_type): New
                function.
                        (parse_type): Likewise.
                        (parse_signature): Likewise.
                        (build_one): Likewise.
                        (build_all): Likewise.
                        (overloaded_base): New struct.
                        (unary_convert_def): Likewise.
                        * config/arm/arm-mve-builtins-shapes.h
(unary_convert): Declare.
                        * config/arm/arm-mve-builtins.cc
(TYPES_reinterpret_signed1): New
                        macro.
                        (TYPES_reinterpret_unsigned1): Likewise.
                        (TYPES_reinterpret_integer): Likewise.
                        (TYPES_reinterpret_integer1): Likewise.
                        (TYPES_reinterpret_float1): Likewise.
                        (TYPES_reinterpret_float): Likewise.
                        (reinterpret_integer): New.
                        (reinterpret_float): New.
                        (handle_arm_mve_h): Register builtins.
                        * config/arm/arm_mve.h (vreinterpretq_s16):
Remove.
                        (vreinterpretq_s32): Likewise.
                        (vreinterpretq_s64): Likewise.
                        (vreinterpretq_s8): Likewise.
                        (vreinterpretq_u16): Likewise.
                        (vreinterpretq_u32): Likewise.
                        (vreinterpretq_u64): Likewise.
                        (vreinterpretq_u8): Likewise.
                        (vreinterpretq_f16): Likewise.
                        (vreinterpretq_f32): Likewise.
                        (vreinterpretq_s16_s32): Likewise.
                        (vreinterpretq_s16_s64): Likewise.
                        (vreinterpretq_s16_s8): Likewise.
                        (vreinterpretq_s16_u16): Likewise.
                        (vreinterpretq_s16_u32): Likewise.
                        (vreinterpretq_s16_u64): Likewise.
                        (vreinterpretq_s16_u8): Likewise.
                        (vreinterpretq_s32_s16): Likewise.
                        (vreinterpretq_s32_s64): Likewise.
                        (vreinterpretq_s32_s8): Likewise.
                        (vreinterpretq_s32_u16): Likewise.
                        (vreinterpretq_s32_u32): Likewise.
                        (vreinterpretq_s32_u64): Likewise.
                        (vreinterpretq_s32_u8): Likewise.
                        (vreinterpretq_s64_s16): Likewise.
                        (vreinterpretq_s64_s32): Likewise.
                        (vreinterpretq_s64_s8): Likewise.
                        (vreinterpretq_s64_u16): Likewise.
                        (vreinterpretq_s64_u32): Likewise.
                        (vreinterpretq_s64_u64): Likewise.
                        (vreinterpretq_s64_u8): Likewise.
                        (vreinterpretq_s8_s16): Likewise.
                        (vreinterpretq_s8_s32): Likewise.
                        (vreinterpretq_s8_s64): Likewise.
                        (vreinterpretq_s8_u16): Likewise.
                        (vreinterpretq_s8_u32): Likewise.
                        (vreinterpretq_s8_u64): Likewise.
                        (vreinterpretq_s8_u8): Likewise.
                        (vreinterpretq_u16_s16): Likewise.
                        (vreinterpretq_u16_s32): Likewise.
                        (vreinterpretq_u16_s64): Likewise.
                        (vreinterpretq_u16_s8): Likewise.
                        (vreinterpretq_u16_u32): Likewise.
                        (vreinterpretq_u16_u64): Likewise.
                        (vreinterpretq_u16_u8): Likewise.
                        (vreinterpretq_u32_s16): Likewise.
                        (vreinterpretq_u32_s32): Likewise.
                        (vreinterpretq_u32_s64): Likewise.
                        (vreinterpretq_u32_s8): Likewise.
                        (vreinterpretq_u32_u16): Likewise.
                        (vreinterpretq_u32_u64): Likewise.
                        (vreinterpretq_u32_u8): Likewise.
                        (vreinterpretq_u64_s16): Likewise.
                        (vreinterpretq_u64_s32): Likewise.
                        (vreinterpretq_u64_s64): Likewise.
                        (vreinterpretq_u64_s8): Likewise.
                        (vreinterpretq_u64_u16): Likewise.
                        (vreinterpretq_u64_u32): Likewise.
                        (vreinterpretq_u64_u8): Likewise.
                        (vreinterpretq_u8_s16): Likewise.
                        (vreinterpretq_u8_s32): Likewise.
                        (vreinterpretq_u8_s64): Likewise.
                        (vreinterpretq_u8_s8): Likewise.
                        (vreinterpretq_u8_u16): Likewise.
                        (vreinterpretq_u8_u32): Likewise.
                        (vreinterpretq_u8_u64): Likewise.
                        (vreinterpretq_s32_f16): Likewise.
                        (vreinterpretq_s32_f32): Likewise.
                        (vreinterpretq_u16_f16): Likewise.
                        (vreinterpretq_u16_f32): Likewise.
                        (vreinterpretq_u32_f16): Likewise.
                        (vreinterpretq_u32_f32): Likewise.
                        (vreinterpretq_u64_f16): Likewise.
                        (vreinterpretq_u64_f32): Likewise.
                        (vreinterpretq_u8_f16): Likewise.
                        (vreinterpretq_u8_f32): Likewise.
                        (vreinterpretq_f16_f32): Likewise.
                        (vreinterpretq_f16_s16): Likewise.
                        (vreinterpretq_f16_s32): Likewise.
                        (vreinterpretq_f16_s64): Likewise.
                        (vreinterpretq_f16_s8): Likewise.
                        (vreinterpretq_f16_u16): Likewise.
                        (vreinterpretq_f16_u32): Likewise.
                        (vreinterpretq_f16_u64): Likewise.
                        (vreinterpretq_f16_u8): Likewise.
                        (vreinterpretq_f32_f16): Likewise.
                        (vreinterpretq_f32_s16): Likewise.
                        (vreinterpretq_f32_s32): Likewise.
                        (vreinterpretq_f32_s64): Likewise.
                        (vreinterpretq_f32_s8): Likewise.
                        (vreinterpretq_f32_u16): Likewise.
                        (vreinterpretq_f32_u32): Likewise.
                        (vreinterpretq_f32_u64): Likewise.
                        (vreinterpretq_f32_u8): Likewise.
                        (vreinterpretq_s16_f16): Likewise.
                        (vreinterpretq_s16_f32): Likewise.
                        (vreinterpretq_s64_f16): Likewise.
                        (vreinterpretq_s64_f32): Likewise.
                        (vreinterpretq_s8_f16): Likewise.
                        (vreinterpretq_s8_f32): Likewise.
                        (__arm_vreinterpretq_f16): Likewise.
                        (__arm_vreinterpretq_f32): Likewise.
                        (__arm_vreinterpretq_s16): Likewise.
                        (__arm_vreinterpretq_s32): Likewise.
                        (__arm_vreinterpretq_s64): Likewise.
                        (__arm_vreinterpretq_s8): Likewise.
                        (__arm_vreinterpretq_u16): Likewise.
                        (__arm_vreinterpretq_u32): Likewise.
                        (__arm_vreinterpretq_u64): Likewise.
                        (__arm_vreinterpretq_u8): Likewise.
                        * config/arm/arm_mve_types.h
(__arm_vreinterpretq_s16_s32):
                Remove.
                        (__arm_vreinterpretq_s16_s64): Likewise.
                        (__arm_vreinterpretq_s16_s8): Likewise.
                        (__arm_vreinterpretq_s16_u16): Likewise.
                        (__arm_vreinterpretq_s16_u32): Likewise.
                        (__arm_vreinterpretq_s16_u64): Likewise.
                        (__arm_vreinterpretq_s16_u8): Likewise.
                        (__arm_vreinterpretq_s32_s16): Likewise.
                        (__arm_vreinterpretq_s32_s64): Likewise.
                        (__arm_vreinterpretq_s32_s8): Likewise.
                        (__arm_vreinterpretq_s32_u16): Likewise.
                        (__arm_vreinterpretq_s32_u32): Likewise.
                        (__arm_vreinterpretq_s32_u64): Likewise.
                        (__arm_vreinterpretq_s32_u8): Likewise.
                        (__arm_vreinterpretq_s64_s16): Likewise.
                        (__arm_vreinterpretq_s64_s32): Likewise.
                        (__arm_vreinterpretq_s64_s8): Likewise.
                        (__arm_vreinterpretq_s64_u16): Likewise.
                        (__arm_vreinterpretq_s64_u32): Likewise.
                        (__arm_vreinterpretq_s64_u64): Likewise.
                        (__arm_vreinterpretq_s64_u8): Likewise.
                        (__arm_vreinterpretq_s8_s16): Likewise.
                        (__arm_vreinterpretq_s8_s32): Likewise.
                        (__arm_vreinterpretq_s8_s64): Likewise.
                        (__arm_vreinterpretq_s8_u16): Likewise.
                        (__arm_vreinterpretq_s8_u32): Likewise.
                        (__arm_vreinterpretq_s8_u64): Likewise.
                        (__arm_vreinterpretq_s8_u8): Likewise.
                        (__arm_vreinterpretq_u16_s16): Likewise.
                        (__arm_vreinterpretq_u16_s32): Likewise.
                        (__arm_vreinterpretq_u16_s64): Likewise.
                        (__arm_vreinterpretq_u16_s8): Likewise.
                        (__arm_vreinterpretq_u16_u32): Likewise.
                        (__arm_vreinterpretq_u16_u64): Likewise.
                        (__arm_vreinterpretq_u16_u8): Likewise.
                        (__arm_vreinterpretq_u32_s16): Likewise.
                        (__arm_vreinterpretq_u32_s32): Likewise.
                        (__arm_vreinterpretq_u32_s64): Likewise.
                        (__arm_vreinterpretq_u32_s8): Likewise.
                        (__arm_vreinterpretq_u32_u16): Likewise.
                        (__arm_vreinterpretq_u32_u64): Likewise.
                        (__arm_vreinterpretq_u32_u8): Likewise.
                        (__arm_vreinterpretq_u64_s16): Likewise.
                        (__arm_vreinterpretq_u64_s32): Likewise.
                        (__arm_vreinterpretq_u64_s64): Likewise.
                        (__arm_vreinterpretq_u64_s8): Likewise.
                        (__arm_vreinterpretq_u64_u16): Likewise.
                        (__arm_vreinterpretq_u64_u32): Likewise.
                        (__arm_vreinterpretq_u64_u8): Likewise.
                        (__arm_vreinterpretq_u8_s16): Likewise.
                        (__arm_vreinterpretq_u8_s32): Likewise.
                        (__arm_vreinterpretq_u8_s64): Likewise.
                        (__arm_vreinterpretq_u8_s8): Likewise.
                        (__arm_vreinterpretq_u8_u16): Likewise.
                        (__arm_vreinterpretq_u8_u32): Likewise.
                        (__arm_vreinterpretq_u8_u64): Likewise.
                        (__arm_vreinterpretq_s32_f16): Likewise.
                        (__arm_vreinterpretq_s32_f32): Likewise.
                        (__arm_vreinterpretq_s16_f16): Likewise.
                        (__arm_vreinterpretq_s16_f32): Likewise.
                        (__arm_vreinterpretq_s64_f16): Likewise.
                        (__arm_vreinterpretq_s64_f32): Likewise.
                        (__arm_vreinterpretq_s8_f16): Likewise.
                        (__arm_vreinterpretq_s8_f32): Likewise.
                        (__arm_vreinterpretq_u16_f16): Likewise.
                        (__arm_vreinterpretq_u16_f32): Likewise.
                        (__arm_vreinterpretq_u32_f16): Likewise.
                        (__arm_vreinterpretq_u32_f32): Likewise.
                        (__arm_vreinterpretq_u64_f16): Likewise.
                        (__arm_vreinterpretq_u64_f32): Likewise.
                        (__arm_vreinterpretq_u8_f16): Likewise.
                        (__arm_vreinterpretq_u8_f32): Likewise.
                        (__arm_vreinterpretq_f16_f32): Likewise.
                        (__arm_vreinterpretq_f16_s16): Likewise.
                        (__arm_vreinterpretq_f16_s32): Likewise.
                        (__arm_vreinterpretq_f16_s64): Likewise.
                        (__arm_vreinterpretq_f16_s8): Likewise.
                        (__arm_vreinterpretq_f16_u16): Likewise.
                        (__arm_vreinterpretq_f16_u32): Likewise.
                        (__arm_vreinterpretq_f16_u64): Likewise.
                        (__arm_vreinterpretq_f16_u8): Likewise.
                        (__arm_vreinterpretq_f32_f16): Likewise.
                        (__arm_vreinterpretq_f32_s16): Likewise.
                        (__arm_vreinterpretq_f32_s32): Likewise.
                        (__arm_vreinterpretq_f32_s64): Likewise.
                        (__arm_vreinterpretq_f32_s8): Likewise.
                        (__arm_vreinterpretq_f32_u16): Likewise.
                        (__arm_vreinterpretq_f32_u32): Likewise.
                        (__arm_vreinterpretq_f32_u64): Likewise.
                        (__arm_vreinterpretq_f32_u8): Likewise.
                        (__arm_vreinterpretq_s16): Likewise.
                        (__arm_vreinterpretq_s32): Likewise.
                        (__arm_vreinterpretq_s64): Likewise.
                        (__arm_vreinterpretq_s8): Likewise.
                        (__arm_vreinterpretq_u16): Likewise.
                        (__arm_vreinterpretq_u32): Likewise.
                        (__arm_vreinterpretq_u64): Likewise.
                        (__arm_vreinterpretq_u8): Likewise.
                        (__arm_vreinterpretq_f16): Likewise.
                        (__arm_vreinterpretq_f32): Likewise.
                        * config/arm/mve.md
(@arm_mve_reinterpret<mode>): New
                pattern.
                        * config/arm/unspecs.md: (REINTERPRET): New
unspec.

                        gcc/testsuite/
                        * g++.target/arm/mve.exp: Add general-c++ and
general directories.
                        * g++.target/arm/mve/general-c++/nomve_fp_1.c:
New test.
                        * g++.target/arm/mve/general-c++/vreinterpretq_1.C:
New test.
                        * gcc.target/arm/mve/general-c/nomve_fp_1.c: New
test.
                        * gcc.target/arm/mve/general-c/vreinterpretq_1.c:
New test.
                ---
                 gcc/config/arm/arm-mve-builtins-base.cc       |   29 +
                 gcc/config/arm/arm-mve-builtins-base.def      |    2 +
                 gcc/config/arm/arm-mve-builtins-base.h        |    2 +
                 gcc/config/arm/arm-mve-builtins-shapes.cc     |   28 +
                 gcc/config/arm/arm-mve-builtins-shapes.h      |    8 +
                 gcc/config/arm/arm-mve-builtins.cc            |   60 +
                 gcc/config/arm/arm_mve.h                      |  300 ----
                 gcc/config/arm/arm_mve_types.h                | 1365 
+-------------
---
                 gcc/config/arm/mve.md                         |   18 +
                 gcc/config/arm/unspecs.md                     |    1 +
                 gcc/testsuite/g++.target/arm/mve.exp          |    8 +-
                 .../arm/mve/general-c++/nomve_fp_1.c          |   15 +
                 .../arm/mve/general-c++/vreinterpretq_1.C     |   25 +
                 .../gcc.target/arm/mve/general-c/nomve_fp_1.c |   15 +
                 .../arm/mve/general-c/vreinterpretq_1.c       |   25 +
                 15 files changed, 286 insertions(+), 1615 deletions(-)
                 create mode 100644
gcc/testsuite/g++.target/arm/mve/general-
                c++/nomve_fp_1.c
                 create mode 100644
gcc/testsuite/g++.target/arm/mve/general-
                c++/vreinterpretq_1.C
                 create mode 100644
gcc/testsuite/gcc.target/arm/mve/general-
                c/nomve_fp_1.c
                 create mode 100644
gcc/testsuite/gcc.target/arm/mve/general-
                c/vreinterpretq_1.c

                diff --git a/gcc/config/arm/arm-mve-builtins-base.cc
b/gcc/config/arm/arm-
                mve-builtins-base.cc
                index e9f285faf2b..ad8d500afc6 100644
                --- a/gcc/config/arm/arm-mve-builtins-base.cc
                +++ b/gcc/config/arm/arm-mve-builtins-base.cc
                @@ -38,8 +38,37 @@ using namespace arm_mve;

                 namespace {

                +/* Implements vreinterpretq_* intrinsics.  */
                +class vreinterpretq_impl : public quiet<function_base>
                +{
                +  gimple *
                +  fold (gimple_folder &f) const override
                +  {
                +    /* Punt to rtl if the effect of the reinterpret on 
registers
does not
                +       conform to GCC's endianness model.  */
                +    if (!targetm.can_change_mode_class (f.vector_mode (0),
                +                                       f.vector_mode (1),
VFP_REGS))
                +      return NULL;
                +


        So we punt to an RTL pattern here if we cannot change mode class...

        [snip]


                diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
                index 35eab6c94bf..ab688396f97 100644
                --- a/gcc/config/arm/mve.md
                +++ b/gcc/config/arm/mve.md
                @@ -10561,3 +10561,21 @@ (define_expand
                "vcond_mask_<mode><MVE_vpred>"
                     }
                   DONE;
                 })
                +
                +;; Reinterpret operand 1 in operand 0's mode, without
changing its contents.
                +(define_expand "@arm_mve_reinterpret<mode>"
                +  [(set (match_operand:MVE_vecs 0 "register_operand")
                +       (unspec:MVE_vecs
                +         [(match_operand 1 "arm_any_register_operand")]
                +         REINTERPRET))]
                +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE
(<MODE>mode))
                +    || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
                (<MODE>mode))"
                +  {
                +    machine_mode src_mode = GET_MODE (operands[1]);
                +    if (targetm.can_change_mode_class (<MODE>mode,
src_mode,
                VFP_REGS))
                +      {
                +       emit_move_insn (operands[0], gen_lowpart
(<MODE>mode,
                operands[1]));
                +       DONE;
                +      }
                +  }
                +)


        ... But we still check can_change_mode_class in this pattern and if it's
not true we emit the new REINTERPRET unspec
        without a corresponding define_insn pattern. Won't that ICE? Would
this case occur on big-endian targets?




Looks like you are right. However, arm_mve.h is protected by:

#if __ARM_BIG_ENDIAN

#error "MVE intrinsics are not supported in Big-Endian mode."




Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE.




In fact, this pattern and vreinterpretq_impl above are quite similar to the
aarch64 implementation.

I tried with a sample

svint16_t foo(svint8_t value1)
{
return svreinterpret_s16_s8(value1);
}
and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK,
although
aarch64_can_change_mode_class() has:
if (BYTES_BIG_ENDIAN)
...
if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE
(to))
return false;
so it should have a similar problem? I', not sure why it doesn't ICE?

I believe that's because there's a pattern in aarch64-sve.md that converts 
everything into a simple set with the right modes forced in.

;; A pattern for handling type punning on big-endian targets.  We use a
;; special predicate for operand 1 to reduce the number of patterns.
(define_insn_and_split "*aarch64_sve_reinterpret<mode>"
   [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
         (unspec:SVE_ALL
           [(match_operand 1 "aarch64_any_register_operand" "w")]
           UNSPEC_REINTERPRET))]
   "TARGET_SVE"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 1))]
   {
     operands[1] = aarch64_replace_reg_mode (operands[1], <MODE>mode);
   }
)

Ha, right, thanks.

I guess since we don't claim to support big-endian MVE for now we probably 
don't need to handle it, but I wonder whether we should instead
be asserting that targetm.can_change_mode_class is true in the folding code and 
adding a comment that it for future big-endian support it should be handled 
properly in the .md file as on aarch64?


Sure, I can easily add an assert, will do for v2.

Thanks,

Christophe

Thanks,
Kyrill


Thanks,
Christophe



        Thanks,
        Kyrill


                diff --git a/gcc/config/arm/unspecs.md
b/gcc/config/arm/unspecs.md
                index 84384ee798d..dccda283573 100644
                --- a/gcc/config/arm/unspecs.md
                +++ b/gcc/config/arm/unspecs.md
                @@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [
                   SQRSHRL_64
                   SQRSHRL_48
                   VSHLCQ_M_
                +  REINTERPRET
                 ])



Reply via email to