On 5/2/23 17:28, Kyrylo Tkachov wrote:
-----Original Message-----
From: Christophe Lyon <christophe.l...@arm.com>
Sent: Tuesday, May 2, 2023 3:05 PM
To: Kyrylo Tkachov <kyrylo.tkac...@arm.com>; gcc-patches@gcc.gnu.org;
Richard Earnshaw <richard.earns...@arm.com>; Richard Sandiford
<richard.sandif...@arm.com>
Subject: Re: [PATCH 03/22] arm: [MVE intrinsics] Rework vreinterpretq
On 5/2/23 12:26, Kyrylo Tkachov wrote:
Hi Christophe,
-----Original Message-----
From: Christophe Lyon <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>
Sent: Tuesday, April 18, 2023 2:46 PM
To: gcc-patches@gcc.gnu.org <mailto:gcc-
patc...@gcc.gnu.org> ; Kyrylo Tkachov <kyrylo.tkac...@arm.com>
<mailto:kyrylo.tkac...@arm.com> ;
Richard Earnshaw <richard.earns...@arm.com>
<mailto:richard.earns...@arm.com> ; Richard Sandiford
<richard.sandif...@arm.com>
<mailto:richard.sandif...@arm.com>
Cc: Christophe Lyon <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>
Subject: [PATCH 03/22] arm: [MVE intrinsics] Rework
vreinterpretq
This patch implements vreinterpretq using the new MVE
intrinsics
framework.
The old definitions for vreinterpretq are removed as a
consequence.
2022-09-08 Murray Steele <murray.ste...@arm.com>
<mailto:murray.ste...@arm.com>
Christophe Lyon <christophe.l...@arm.com>
<mailto:christophe.l...@arm.com>
gcc/
* config/arm/arm-mve-builtins-base.cc
(vreinterpretq_impl): New
class.
* config/arm/arm-mve-builtins-base.def: Define
vreinterpretq.
* config/arm/arm-mve-builtins-base.h
(vreinterpretq): New
declaration.
* config/arm/arm-mve-builtins-shapes.cc
(parse_element_type): New
function.
(parse_type): Likewise.
(parse_signature): Likewise.
(build_one): Likewise.
(build_all): Likewise.
(overloaded_base): New struct.
(unary_convert_def): Likewise.
* config/arm/arm-mve-builtins-shapes.h
(unary_convert): Declare.
* config/arm/arm-mve-builtins.cc
(TYPES_reinterpret_signed1): New
macro.
(TYPES_reinterpret_unsigned1): Likewise.
(TYPES_reinterpret_integer): Likewise.
(TYPES_reinterpret_integer1): Likewise.
(TYPES_reinterpret_float1): Likewise.
(TYPES_reinterpret_float): Likewise.
(reinterpret_integer): New.
(reinterpret_float): New.
(handle_arm_mve_h): Register builtins.
* config/arm/arm_mve.h (vreinterpretq_s16):
Remove.
(vreinterpretq_s32): Likewise.
(vreinterpretq_s64): Likewise.
(vreinterpretq_s8): Likewise.
(vreinterpretq_u16): Likewise.
(vreinterpretq_u32): Likewise.
(vreinterpretq_u64): Likewise.
(vreinterpretq_u8): Likewise.
(vreinterpretq_f16): Likewise.
(vreinterpretq_f32): Likewise.
(vreinterpretq_s16_s32): Likewise.
(vreinterpretq_s16_s64): Likewise.
(vreinterpretq_s16_s8): Likewise.
(vreinterpretq_s16_u16): Likewise.
(vreinterpretq_s16_u32): Likewise.
(vreinterpretq_s16_u64): Likewise.
(vreinterpretq_s16_u8): Likewise.
(vreinterpretq_s32_s16): Likewise.
(vreinterpretq_s32_s64): Likewise.
(vreinterpretq_s32_s8): Likewise.
(vreinterpretq_s32_u16): Likewise.
(vreinterpretq_s32_u32): Likewise.
(vreinterpretq_s32_u64): Likewise.
(vreinterpretq_s32_u8): Likewise.
(vreinterpretq_s64_s16): Likewise.
(vreinterpretq_s64_s32): Likewise.
(vreinterpretq_s64_s8): Likewise.
(vreinterpretq_s64_u16): Likewise.
(vreinterpretq_s64_u32): Likewise.
(vreinterpretq_s64_u64): Likewise.
(vreinterpretq_s64_u8): Likewise.
(vreinterpretq_s8_s16): Likewise.
(vreinterpretq_s8_s32): Likewise.
(vreinterpretq_s8_s64): Likewise.
(vreinterpretq_s8_u16): Likewise.
(vreinterpretq_s8_u32): Likewise.
(vreinterpretq_s8_u64): Likewise.
(vreinterpretq_s8_u8): Likewise.
(vreinterpretq_u16_s16): Likewise.
(vreinterpretq_u16_s32): Likewise.
(vreinterpretq_u16_s64): Likewise.
(vreinterpretq_u16_s8): Likewise.
(vreinterpretq_u16_u32): Likewise.
(vreinterpretq_u16_u64): Likewise.
(vreinterpretq_u16_u8): Likewise.
(vreinterpretq_u32_s16): Likewise.
(vreinterpretq_u32_s32): Likewise.
(vreinterpretq_u32_s64): Likewise.
(vreinterpretq_u32_s8): Likewise.
(vreinterpretq_u32_u16): Likewise.
(vreinterpretq_u32_u64): Likewise.
(vreinterpretq_u32_u8): Likewise.
(vreinterpretq_u64_s16): Likewise.
(vreinterpretq_u64_s32): Likewise.
(vreinterpretq_u64_s64): Likewise.
(vreinterpretq_u64_s8): Likewise.
(vreinterpretq_u64_u16): Likewise.
(vreinterpretq_u64_u32): Likewise.
(vreinterpretq_u64_u8): Likewise.
(vreinterpretq_u8_s16): Likewise.
(vreinterpretq_u8_s32): Likewise.
(vreinterpretq_u8_s64): Likewise.
(vreinterpretq_u8_s8): Likewise.
(vreinterpretq_u8_u16): Likewise.
(vreinterpretq_u8_u32): Likewise.
(vreinterpretq_u8_u64): Likewise.
(vreinterpretq_s32_f16): Likewise.
(vreinterpretq_s32_f32): Likewise.
(vreinterpretq_u16_f16): Likewise.
(vreinterpretq_u16_f32): Likewise.
(vreinterpretq_u32_f16): Likewise.
(vreinterpretq_u32_f32): Likewise.
(vreinterpretq_u64_f16): Likewise.
(vreinterpretq_u64_f32): Likewise.
(vreinterpretq_u8_f16): Likewise.
(vreinterpretq_u8_f32): Likewise.
(vreinterpretq_f16_f32): Likewise.
(vreinterpretq_f16_s16): Likewise.
(vreinterpretq_f16_s32): Likewise.
(vreinterpretq_f16_s64): Likewise.
(vreinterpretq_f16_s8): Likewise.
(vreinterpretq_f16_u16): Likewise.
(vreinterpretq_f16_u32): Likewise.
(vreinterpretq_f16_u64): Likewise.
(vreinterpretq_f16_u8): Likewise.
(vreinterpretq_f32_f16): Likewise.
(vreinterpretq_f32_s16): Likewise.
(vreinterpretq_f32_s32): Likewise.
(vreinterpretq_f32_s64): Likewise.
(vreinterpretq_f32_s8): Likewise.
(vreinterpretq_f32_u16): Likewise.
(vreinterpretq_f32_u32): Likewise.
(vreinterpretq_f32_u64): Likewise.
(vreinterpretq_f32_u8): Likewise.
(vreinterpretq_s16_f16): Likewise.
(vreinterpretq_s16_f32): Likewise.
(vreinterpretq_s64_f16): Likewise.
(vreinterpretq_s64_f32): Likewise.
(vreinterpretq_s8_f16): Likewise.
(vreinterpretq_s8_f32): Likewise.
(__arm_vreinterpretq_f16): Likewise.
(__arm_vreinterpretq_f32): Likewise.
(__arm_vreinterpretq_s16): Likewise.
(__arm_vreinterpretq_s32): Likewise.
(__arm_vreinterpretq_s64): Likewise.
(__arm_vreinterpretq_s8): Likewise.
(__arm_vreinterpretq_u16): Likewise.
(__arm_vreinterpretq_u32): Likewise.
(__arm_vreinterpretq_u64): Likewise.
(__arm_vreinterpretq_u8): Likewise.
* config/arm/arm_mve_types.h
(__arm_vreinterpretq_s16_s32):
Remove.
(__arm_vreinterpretq_s16_s64): Likewise.
(__arm_vreinterpretq_s16_s8): Likewise.
(__arm_vreinterpretq_s16_u16): Likewise.
(__arm_vreinterpretq_s16_u32): Likewise.
(__arm_vreinterpretq_s16_u64): Likewise.
(__arm_vreinterpretq_s16_u8): Likewise.
(__arm_vreinterpretq_s32_s16): Likewise.
(__arm_vreinterpretq_s32_s64): Likewise.
(__arm_vreinterpretq_s32_s8): Likewise.
(__arm_vreinterpretq_s32_u16): Likewise.
(__arm_vreinterpretq_s32_u32): Likewise.
(__arm_vreinterpretq_s32_u64): Likewise.
(__arm_vreinterpretq_s32_u8): Likewise.
(__arm_vreinterpretq_s64_s16): Likewise.
(__arm_vreinterpretq_s64_s32): Likewise.
(__arm_vreinterpretq_s64_s8): Likewise.
(__arm_vreinterpretq_s64_u16): Likewise.
(__arm_vreinterpretq_s64_u32): Likewise.
(__arm_vreinterpretq_s64_u64): Likewise.
(__arm_vreinterpretq_s64_u8): Likewise.
(__arm_vreinterpretq_s8_s16): Likewise.
(__arm_vreinterpretq_s8_s32): Likewise.
(__arm_vreinterpretq_s8_s64): Likewise.
(__arm_vreinterpretq_s8_u16): Likewise.
(__arm_vreinterpretq_s8_u32): Likewise.
(__arm_vreinterpretq_s8_u64): Likewise.
(__arm_vreinterpretq_s8_u8): Likewise.
(__arm_vreinterpretq_u16_s16): Likewise.
(__arm_vreinterpretq_u16_s32): Likewise.
(__arm_vreinterpretq_u16_s64): Likewise.
(__arm_vreinterpretq_u16_s8): Likewise.
(__arm_vreinterpretq_u16_u32): Likewise.
(__arm_vreinterpretq_u16_u64): Likewise.
(__arm_vreinterpretq_u16_u8): Likewise.
(__arm_vreinterpretq_u32_s16): Likewise.
(__arm_vreinterpretq_u32_s32): Likewise.
(__arm_vreinterpretq_u32_s64): Likewise.
(__arm_vreinterpretq_u32_s8): Likewise.
(__arm_vreinterpretq_u32_u16): Likewise.
(__arm_vreinterpretq_u32_u64): Likewise.
(__arm_vreinterpretq_u32_u8): Likewise.
(__arm_vreinterpretq_u64_s16): Likewise.
(__arm_vreinterpretq_u64_s32): Likewise.
(__arm_vreinterpretq_u64_s64): Likewise.
(__arm_vreinterpretq_u64_s8): Likewise.
(__arm_vreinterpretq_u64_u16): Likewise.
(__arm_vreinterpretq_u64_u32): Likewise.
(__arm_vreinterpretq_u64_u8): Likewise.
(__arm_vreinterpretq_u8_s16): Likewise.
(__arm_vreinterpretq_u8_s32): Likewise.
(__arm_vreinterpretq_u8_s64): Likewise.
(__arm_vreinterpretq_u8_s8): Likewise.
(__arm_vreinterpretq_u8_u16): Likewise.
(__arm_vreinterpretq_u8_u32): Likewise.
(__arm_vreinterpretq_u8_u64): Likewise.
(__arm_vreinterpretq_s32_f16): Likewise.
(__arm_vreinterpretq_s32_f32): Likewise.
(__arm_vreinterpretq_s16_f16): Likewise.
(__arm_vreinterpretq_s16_f32): Likewise.
(__arm_vreinterpretq_s64_f16): Likewise.
(__arm_vreinterpretq_s64_f32): Likewise.
(__arm_vreinterpretq_s8_f16): Likewise.
(__arm_vreinterpretq_s8_f32): Likewise.
(__arm_vreinterpretq_u16_f16): Likewise.
(__arm_vreinterpretq_u16_f32): Likewise.
(__arm_vreinterpretq_u32_f16): Likewise.
(__arm_vreinterpretq_u32_f32): Likewise.
(__arm_vreinterpretq_u64_f16): Likewise.
(__arm_vreinterpretq_u64_f32): Likewise.
(__arm_vreinterpretq_u8_f16): Likewise.
(__arm_vreinterpretq_u8_f32): Likewise.
(__arm_vreinterpretq_f16_f32): Likewise.
(__arm_vreinterpretq_f16_s16): Likewise.
(__arm_vreinterpretq_f16_s32): Likewise.
(__arm_vreinterpretq_f16_s64): Likewise.
(__arm_vreinterpretq_f16_s8): Likewise.
(__arm_vreinterpretq_f16_u16): Likewise.
(__arm_vreinterpretq_f16_u32): Likewise.
(__arm_vreinterpretq_f16_u64): Likewise.
(__arm_vreinterpretq_f16_u8): Likewise.
(__arm_vreinterpretq_f32_f16): Likewise.
(__arm_vreinterpretq_f32_s16): Likewise.
(__arm_vreinterpretq_f32_s32): Likewise.
(__arm_vreinterpretq_f32_s64): Likewise.
(__arm_vreinterpretq_f32_s8): Likewise.
(__arm_vreinterpretq_f32_u16): Likewise.
(__arm_vreinterpretq_f32_u32): Likewise.
(__arm_vreinterpretq_f32_u64): Likewise.
(__arm_vreinterpretq_f32_u8): Likewise.
(__arm_vreinterpretq_s16): Likewise.
(__arm_vreinterpretq_s32): Likewise.
(__arm_vreinterpretq_s64): Likewise.
(__arm_vreinterpretq_s8): Likewise.
(__arm_vreinterpretq_u16): Likewise.
(__arm_vreinterpretq_u32): Likewise.
(__arm_vreinterpretq_u64): Likewise.
(__arm_vreinterpretq_u8): Likewise.
(__arm_vreinterpretq_f16): Likewise.
(__arm_vreinterpretq_f32): Likewise.
* config/arm/mve.md
(@arm_mve_reinterpret<mode>): New
pattern.
* config/arm/unspecs.md: (REINTERPRET): New
unspec.
gcc/testsuite/
* g++.target/arm/mve.exp: Add general-c++ and
general directories.
* g++.target/arm/mve/general-c++/nomve_fp_1.c:
New test.
* g++.target/arm/mve/general-c++/vreinterpretq_1.C:
New test.
* gcc.target/arm/mve/general-c/nomve_fp_1.c: New
test.
* gcc.target/arm/mve/general-c/vreinterpretq_1.c:
New test.
---
gcc/config/arm/arm-mve-builtins-base.cc | 29 +
gcc/config/arm/arm-mve-builtins-base.def | 2 +
gcc/config/arm/arm-mve-builtins-base.h | 2 +
gcc/config/arm/arm-mve-builtins-shapes.cc | 28 +
gcc/config/arm/arm-mve-builtins-shapes.h | 8 +
gcc/config/arm/arm-mve-builtins.cc | 60 +
gcc/config/arm/arm_mve.h | 300 ----
gcc/config/arm/arm_mve_types.h | 1365
+-------------
---
gcc/config/arm/mve.md | 18 +
gcc/config/arm/unspecs.md | 1 +
gcc/testsuite/g++.target/arm/mve.exp | 8 +-
.../arm/mve/general-c++/nomve_fp_1.c | 15 +
.../arm/mve/general-c++/vreinterpretq_1.C | 25 +
.../gcc.target/arm/mve/general-c/nomve_fp_1.c | 15 +
.../arm/mve/general-c/vreinterpretq_1.c | 25 +
15 files changed, 286 insertions(+), 1615 deletions(-)
create mode 100644
gcc/testsuite/g++.target/arm/mve/general-
c++/nomve_fp_1.c
create mode 100644
gcc/testsuite/g++.target/arm/mve/general-
c++/vreinterpretq_1.C
create mode 100644
gcc/testsuite/gcc.target/arm/mve/general-
c/nomve_fp_1.c
create mode 100644
gcc/testsuite/gcc.target/arm/mve/general-
c/vreinterpretq_1.c
diff --git a/gcc/config/arm/arm-mve-builtins-base.cc
b/gcc/config/arm/arm-
mve-builtins-base.cc
index e9f285faf2b..ad8d500afc6 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -38,8 +38,37 @@ using namespace arm_mve;
namespace {
+/* Implements vreinterpretq_* intrinsics. */
+class vreinterpretq_impl : public quiet<function_base>
+{
+ gimple *
+ fold (gimple_folder &f) const override
+ {
+ /* Punt to rtl if the effect of the reinterpret on
registers
does not
+ conform to GCC's endianness model. */
+ if (!targetm.can_change_mode_class (f.vector_mode (0),
+ f.vector_mode (1),
VFP_REGS))
+ return NULL;
+
So we punt to an RTL pattern here if we cannot change mode class...
[snip]
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 35eab6c94bf..ab688396f97 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -10561,3 +10561,21 @@ (define_expand
"vcond_mask_<mode><MVE_vpred>"
}
DONE;
})
+
+;; Reinterpret operand 1 in operand 0's mode, without
changing its contents.
+(define_expand "@arm_mve_reinterpret<mode>"
+ [(set (match_operand:MVE_vecs 0 "register_operand")
+ (unspec:MVE_vecs
+ [(match_operand 1 "arm_any_register_operand")]
+ REINTERPRET))]
+ "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE
(<MODE>mode))
+ || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
(<MODE>mode))"
+ {
+ machine_mode src_mode = GET_MODE (operands[1]);
+ if (targetm.can_change_mode_class (<MODE>mode,
src_mode,
VFP_REGS))
+ {
+ emit_move_insn (operands[0], gen_lowpart
(<MODE>mode,
operands[1]));
+ DONE;
+ }
+ }
+)
... But we still check can_change_mode_class in this pattern and if it's
not true we emit the new REINTERPRET unspec
without a corresponding define_insn pattern. Won't that ICE? Would
this case occur on big-endian targets?
Looks like you are right. However, arm_mve.h is protected by:
#if __ARM_BIG_ENDIAN
#error "MVE intrinsics are not supported in Big-Endian mode."
Just tried to hack my arm_mve.h to accept big-endian, and indeed we do ICE.
In fact, this pattern and vreinterpretq_impl above are quite similar to the
aarch64 implementation.
I tried with a sample
svint16_t foo(svint8_t value1)
{
return svreinterpret_s16_s8(value1);
}
and it seems aarch64-none-elf-gcc -march=armv8.2-a+sve -mbig-endian is OK,
although
aarch64_can_change_mode_class() has:
if (BYTES_BIG_ENDIAN)
...
if (from_sve_p && GET_MODE_UNIT_SIZE (from) != GET_MODE_UNIT_SIZE
(to))
return false;
so it should have a similar problem? I', not sure why it doesn't ICE?
I believe that's because there's a pattern in aarch64-sve.md that converts
everything into a simple set with the right modes forced in.
;; A pattern for handling type punning on big-endian targets. We use a
;; special predicate for operand 1 to reduce the number of patterns.
(define_insn_and_split "*aarch64_sve_reinterpret<mode>"
[(set (match_operand:SVE_ALL 0 "register_operand" "=w")
(unspec:SVE_ALL
[(match_operand 1 "aarch64_any_register_operand" "w")]
UNSPEC_REINTERPRET))]
"TARGET_SVE"
"#"
"&& reload_completed"
[(set (match_dup 0) (match_dup 1))]
{
operands[1] = aarch64_replace_reg_mode (operands[1], <MODE>mode);
}
)
Ha, right, thanks.
I guess since we don't claim to support big-endian MVE for now we probably
don't need to handle it, but I wonder whether we should instead
be asserting that targetm.can_change_mode_class is true in the folding code and
adding a comment that it for future big-endian support it should be handled
properly in the .md file as on aarch64?
Sure, I can easily add an assert, will do for v2.
Thanks,
Christophe
Thanks,
Kyrill
Thanks,
Christophe
Thanks,
Kyrill
diff --git a/gcc/config/arm/unspecs.md
b/gcc/config/arm/unspecs.md
index 84384ee798d..dccda283573 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -1255,4 +1255,5 @@ (define_c_enum "unspec" [
SQRSHRL_64
SQRSHRL_48
VSHLCQ_M_
+ REINTERPRET
])