From: Karl Meakin <[email protected]>
Add all the necessary infrastructure for defining NEON intrinsics using the
pragma-based framework,
and port the `vadd` family of functions to demonstrate that it works.
gcc/ChangeLog:
* config/aarch64/aarch64-neon-builtins-base.cc: New file.
* config/aarch64/aarch64-neon-builtins-base.def: New file.
* config/aarch64/aarch64-neon-builtins-base.h: New file.
* config/aarch64/aarch64-neon-builtins-functions.h: New file.
* config/aarch64/aarch64-neon-builtins-shapes.cc: New file.
* config/aarch64/aarch64-neon-builtins-shapes.h: New file.
* config/aarch64/aarch64-neon-builtins.cc: New file.
* config/aarch64/aarch64-neon-builtins.def: New file.
* config/aarch64/aarch64-neon-builtins.h: New file.
* config.gcc (extra_headers, extra_objs): Add new files and reformat
for readability.
* config/aarch64/t-aarch64: Add recipes for new files.
* config/aarch64/aarch64-protos.h (handle_arm_neon_h): Rename to
`init_arm_neon_builtins`.
* config/aarch64/aarch64-builtins.cc (handle_arm_neon_h): Likewise.
* config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64): Call
`aarch64_acle::handle_arm_neon_h`.
* config/aarch64/aarch64-sve-builtins.cc (TYPES_*): Move to
aarch64-acle-builtins.h
(NONSTREAMING_SVE, SVE_AND_SME, SSVE): Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc (build_all): Remove
`static` qualifier.
(gimple_folder::fold): Allow folding when `TARGET_SVE` is false but
`TARGET_SIMD` is true.
* config/aarch64/aarch64-sve-builtins.def (p8, p16, p64, 128): New type
suffixes.
* config/aarch64/aarch64-acle-builtins.h (enum handle_pragma_index):
New enum member
`arm_neon_handle`.
(enum type_class_index): New enum member `TYPE_poly`.
(build_all): New declaration, so it can be used from
`aarch64-neon-builtins.cc`.
(TYPES_*): Moved from `aarch64-sve-builtins.cc`.
(NONSTREAMING_SVE, SVE_AND_SME, SSVE): Likewise.
* config/aarch64/arm_neon.h (vadd_s8, vadd_s16, vadd_s32, vadd_f32,
vadd_f64, vadd_u8,
vadd_u16, vadd_u32, vadd_s64, vadd_u64, vaddq_s8, vaddq_s16, vaddq_s32,
vaddq_s64,
vaddq_f32, vaddq_f64, vaddq_u8, vaddq_u16, vaddq_u32, vaddq_u64,
vadd_f16, vaddq_f16,
vadd_p8, vadd_p16, vadd_p64, vaddq_p8, vaddq_p16, vaddq_p64,
vaddq_p128, vaddd_u64,
vaddd_s64): Delete function definitions.
gcc/testsuite/ChangeLog:
* g++.target/aarch64/pr103147-6.C: Fix tests.
* g++.target/aarch64/pr117048.C: Fix tests.
* gcc.target/aarch64/pr103147-6.c: Fix tests.
* gcc.target/aarch64/neon/aarch64-neon.exp: New test.
* gcc.target/aarch64/neon/arm_neon_test.h: New test.
* gcc.target/aarch64/neon/vadd.c: New test.
---
gcc/config.gcc | 20 +-
gcc/config/aarch64/aarch64-acle-builtins.h | 826 +++++++++++++++++
gcc/config/aarch64/aarch64-builtins.cc | 12 +-
gcc/config/aarch64/aarch64-c.cc | 3 +-
.../aarch64/aarch64-neon-builtins-base.cc | 113 +++
.../aarch64/aarch64-neon-builtins-base.def | 33 +
.../aarch64/aarch64-neon-builtins-base.h | 29 +
.../aarch64/aarch64-neon-builtins-functions.h | 29 +
.../aarch64/aarch64-neon-builtins-shapes.cc | 69 ++
.../aarch64/aarch64-neon-builtins-shapes.h | 29 +
gcc/config/aarch64/aarch64-neon-builtins.cc | 86 ++
gcc/config/aarch64/aarch64-neon-builtins.def | 40 +
gcc/config/aarch64/aarch64-neon-builtins.h | 28 +
gcc/config/aarch64/aarch64-protos.h | 2 +-
.../aarch64/aarch64-sve-builtins-shapes.cc | 16 +-
gcc/config/aarch64/aarch64-sve-builtins.cc | 855 +-----------------
gcc/config/aarch64/aarch64-sve-builtins.def | 11 +
gcc/config/aarch64/arm_neon.h | 204 -----
gcc/config/aarch64/t-aarch64 | 46 +
gcc/testsuite/g++.target/aarch64/pr103147-6.C | 1 +
gcc/testsuite/g++.target/aarch64/pr117048.C | 2 +-
.../gcc.target/aarch64/neon/aarch64-neon.exp | 39 +
.../gcc.target/aarch64/neon/arm_neon_test.h | 22 +
gcc/testsuite/gcc.target/aarch64/neon/vadd.c | 203 +++++
gcc/testsuite/gcc.target/aarch64/pr103147-6.c | 1 +
25 files changed, 1688 insertions(+), 1031 deletions(-)
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-base.cc
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-base.def
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-base.h
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-functions.h
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-shapes.cc
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins-shapes.h
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins.cc
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins.def
create mode 100644 gcc/config/aarch64/aarch64-neon-builtins.h
create mode 100644 gcc/testsuite/gcc.target/aarch64/neon/aarch64-neon.exp
create mode 100644 gcc/testsuite/gcc.target/aarch64/neon/arm_neon_test.h
create mode 100644 gcc/testsuite/gcc.target/aarch64/neon/vadd.c
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 580a7fdee6b5..c125096a65cd 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -192,7 +192,7 @@
# the --with-sysroot configure option or the
# --sysroot command line option is used this
# will be relative to the sysroot.
-# target_type_format_char
+# target_type_format_char
# The default character to be used for formatting
# the attribute in a
# .type symbol_name, ${t_t_f_c}<property>
@@ -361,7 +361,18 @@ cpu_is_64bit=
case ${target} in
aarch64*-*-*)
cpu_type=aarch64
- extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h
arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h arm_private_neon_types.h"
+ extra_headers=(
+ 'arm_fp16.h'
+ 'arm_neon.h'
+ 'arm_bf16.h'
+ 'arm_acle.h'
+ 'arm_sve.h'
+ 'arm_sme.h'
+ 'arm_neon_sve_bridge.h'
+ 'arm_private_fp8.h'
+ 'arm_private_neon_types.h'
+ )
+ extra_headers="${extra_headers[@]}"
c_target_objs="aarch64-c.o"
cxx_target_objs="aarch64-c.o"
d_target_objs="aarch64-d.o"
@@ -382,6 +393,9 @@ aarch64*-*-*)
'aarch64-json-tunings-printer.o'
'aarch64-json-tunings-parser.o'
'aarch64-narrow-gp-writes.o'
+ 'aarch64-neon-builtins.o'
+ 'aarch64-neon-builtins-base.o'
+ 'aarch64-neon-builtins-shapes.o'
)
extra_objs="${extra_objs[@]}"
target_gtfiles=(
@@ -390,6 +404,8 @@ aarch64*-*-*)
'$(srcdir)/config/aarch64/aarch64-builtins.cc'
'$(srcdir)/config/aarch64/aarch64-acle-builtins.h'
'$(srcdir)/config/aarch64/aarch64-sve-builtins.cc'
+ '$(srcdir)/config/aarch64/aarch64-neon-builtins.cc'
+ '$(srcdir)/config/aarch64/aarch64-neon-builtins.h'
)
target_gtfiles="${target_gtfiles[@]}"
target_has_targetm_common=yes
diff --git a/gcc/config/aarch64/aarch64-acle-builtins.h
b/gcc/config/aarch64/aarch64-acle-builtins.h
index 20152aaea6a2..f0511456313e 100644
--- a/gcc/config/aarch64/aarch64-acle-builtins.h
+++ b/gcc/config/aarch64/aarch64-acle-builtins.h
@@ -129,6 +129,7 @@ enum units_index
/* Enumerates the pragma handlers. */
enum handle_pragma_index
{
+ arm_neon_handle,
arm_sve_handle,
arm_sme_handle,
arm_neon_sve_handle,
@@ -187,6 +188,7 @@ enum type_class_index
TYPE_mfloat,
TYPE_signed,
TYPE_unsigned,
+ TYPE_poly,
NUM_TYPE_CLASSES
};
@@ -1172,6 +1174,830 @@ function_expander::result_mode () const
return TYPE_MODE (TREE_TYPE (TREE_TYPE (fndecl)));
}
+void build_all (function_builder &b, const char *signature,
+ const function_group_info &group,
+ mode_suffix_index mode_suffix_id,
+ bool force_direct_overloads = false);
+
+/* Define a TYPES_<combination> macro for each combination of type
+ suffixes that an ACLE function can have, where <combination> is the
+ name used in DEF_SVE_FUNCTION entries.
+
+ Use S (T) for single type suffix T and D (T1, T2) for a pair of type
+ suffixes T1 and T2. Use commas to separate the suffixes.
+
+ Although the order shouldn't matter, the convention is to sort the
+ suffixes lexicographically after dividing suffixes into a type
+ class ("b", "f", etc.) and a numerical bit count. */
+
+/* _b8 _b16 _b32 _b64. */
+#define TYPES_all_pred(S, D, T) \
+ S (b8), S (b16), S (b32), S (b64)
+
+/* _c8 _c16 _c32 _c64. */
+#define TYPES_all_count(S, D, T) \
+ S (c8), S (c16), S (c32), S (c64)
+
+/* _b8 _b16 _b32 _b64
+ _c8 _c16 _c32 _c64. */
+#define TYPES_all_pred_count(S, D, T) \
+ TYPES_all_pred (S, D, T), \
+ TYPES_all_count (S, D, T)
+
+/* _f16 _f32 _f64. */
+#define TYPES_all_float(S, D, T) \
+ S (f16), S (f32), S (f64)
+
+/* _s8 _s16 _s32 _s64. */
+#define TYPES_all_signed(S, D, T) \
+ S (s8), S (s16), S (s32), S (s64)
+
+/* _f16 _f32 _f64
+ _s8 _s16 _s32 _s64. */
+#define TYPES_all_float_and_signed(S, D, T) \
+ TYPES_all_float (S, D, T), TYPES_all_signed (S, D, T)
+
+/* _u8 _u16 _u32 _u64. */
+#define TYPES_all_unsigned(S, D, T) \
+ S (u8), S (u16), S (u32), S (u64)
+
+/* _s8 _s16 _s32 _s64
+ _u8 _u16 _u32 _u64. */
+#define TYPES_all_integer(S, D, T) \
+ TYPES_all_signed (S, D, T), TYPES_all_unsigned (S, D, T)
+
+/* _f16 _f32 _f64
+ _s8 _s16 _s32 _s64
+ _u8 _u16 _u32 _u64. */
+#define TYPES_all_arith(S, D, T) \
+ TYPES_all_float (S, D, T), TYPES_all_integer (S, D, T)
+
+/* _f32 _f64
+ _s8 _s16 _s32 _s64
+ _u8 _u16 _u32 _u64. */
+#define TYPES_all_arith_no_fp16(S, D, T) \
+ S (f32), S (f64), \
+ TYPES_all_integer (S, D, T)
+
+#define TYPES_all_data(S, D, T) \
+ TYPES_b_data (S, D, T), \
+ TYPES_h_data (S, D, T), \
+ TYPES_s_data (S, D, T), \
+ TYPES_d_data (S, D, T)
+
+/* _b only. */
+#define TYPES_b(S, D, T) \
+ S (b)
+
+/* _c only. */
+#define TYPES_c(S, D, T) \
+ S (c)
+
+/* _u8. */
+#define TYPES_b_unsigned(S, D, T) \
+ S (u8)
+
+/* _s8
+ _u8. */
+#define TYPES_b_integer(S, D, T) \
+ S (s8), TYPES_b_unsigned (S, D, T)
+
+/* _mf8
+ _s8
+ _u8. */
+#define TYPES_b_data(S, D, T) \
+ S (mf8), TYPES_b_integer (S, D, T)
+
+/* _s8 _s16
+ _u8 _u16. */
+#define TYPES_bh_integer(S, D, T) \
+ S (s8), S (s16), S (u8), S (u16)
+
+/* _u8 _u32. */
+#define TYPES_bs_unsigned(S, D, T) \
+ S (u8), S (u32)
+
+/* _s8 _s16 _s32. */
+#define TYPES_bhs_signed(S, D, T) \
+ S (s8), S (s16), S (s32)
+
+/* _u8 _u16 _u32. */
+#define TYPES_bhs_unsigned(S, D, T) \
+ S (u8), S (u16), S (u32)
+
+/* _s8 _s16 _s32
+ _u8 _u16 _u32. */
+#define TYPES_bhs_integer(S, D, T) \
+ TYPES_bhs_signed (S, D, T), TYPES_bhs_unsigned (S, D, T)
+
+#define TYPES_bh_data(S, D, T) \
+ TYPES_b_data (S, D, T), \
+ TYPES_h_data (S, D, T)
+
+#define TYPES_bhs_data(S, D, T) \
+ TYPES_b_data (S, D, T), \
+ TYPES_h_data (S, D, T), \
+ TYPES_s_data (S, D, T)
+
+/* _s16_s8 _s32_s16 _s64_s32
+ _u16_u8 _u32_u16 _u64_u32. */
+#define TYPES_bhs_widen(S, D, T) \
+ D (s16, s8), D (s32, s16), D (s64, s32), \
+ D (u16, u8), D (u32, u16), D (u64, u32)
+
+/* _bf16. */
+#define TYPES_h_bfloat(S, D, T) \
+ S (bf16)
+
+/* _f16. */
+#define TYPES_h_float(S, D, T) \
+ S (f16)
+
+/* _s16
+ _u16. */
+#define TYPES_h_integer(S, D, T) \
+ S (s16), S (u16)
+
+/* _bf16
+ _f16
+ _s16
+ _u16. */
+#define TYPES_h_data(S, D, T) \
+ S (bf16), S (f16), TYPES_h_integer (S, D, T)
+
+/* _s16 _s32. */
+#define TYPES_hs_signed(S, D, T) \
+ S (s16), S (s32)
+
+/* _s16 _s32
+ _u16 _u32. */
+#define TYPES_hs_integer(S, D, T) \
+ TYPES_hs_signed (S, D, T), S (u16), S (u32)
+
+/* _f16 _f32. */
+#define TYPES_hs_float(S, D, T) \
+ S (f16), S (f32)
+
+#define TYPES_hs_data(S, D, T) \
+ TYPES_h_data (S, D, T), \
+ TYPES_s_data (S, D, T)
+
+/* _u16 _u64. */
+#define TYPES_hd_unsigned(S, D, T) \
+ S (u16), S (u64)
+
+/* _s16 _s32 _s64. */
+#define TYPES_hsd_signed(S, D, T) \
+ S (s16), S (s32), S (s64)
+
+/* _s16 _s32 _s64
+ _u16 _u32 _u64. */
+#define TYPES_hsd_integer(S, D, T) \
+ TYPES_hsd_signed (S, D, T), S (u16), S (u32), S (u64)
+
+#define TYPES_hsd_data(S, D, T) \
+ TYPES_h_data (S, D, T), \
+ TYPES_s_data (S, D, T), \
+ TYPES_d_data (S, D, T)
+
+/* _f16_mf8. */
+#define TYPES_h_float_mf8(S, D, T) \
+ D (f16, mf8)
+
+/* _f32. */
+#define TYPES_s_float(S, D, T) \
+ S (f32)
+
+/* _f32_mf8. */
+#define TYPES_s_float_mf8(S, D, T) \
+ D (f32, mf8)
+
+/* _f32
+ _s16 _s32 _s64
+ _u16 _u32 _u64. */
+#define TYPES_s_float_hsd_integer(S, D, T) \
+ TYPES_s_float (S, D, T), TYPES_hsd_integer (S, D, T)
+
+/* _f32
+ _s32 _s64
+ _u32 _u64. */
+#define TYPES_s_float_sd_integer(S, D, T) \
+ TYPES_s_float (S, D, T), TYPES_sd_integer (S, D, T)
+
+/* _s32. */
+#define TYPES_s_signed(S, D, T) \
+ S (s32)
+
+/* _u32. */
+#define TYPES_s_unsigned(S, D, T) \
+ S (u32)
+
+/* _s32
+ _u32. */
+#define TYPES_s_integer(S, D, T) \
+ TYPES_s_signed (S, D, T), TYPES_s_unsigned (S, D, T)
+
+/* _f32
+ _s32
+ _u32. */
+#define TYPES_s_data(S, D, T) \
+ TYPES_s_float (S, D, T), TYPES_s_integer (S, D, T)
+
+/* _s32 _s64. */
+#define TYPES_sd_signed(S, D, T) \
+ S (s32), S (s64)
+
+/* _u32 _u64. */
+#define TYPES_sd_unsigned(S, D, T) \
+ S (u32), S (u64)
+
+/* _s32 _s64
+ _u32 _u64. */
+#define TYPES_sd_integer(S, D, T) \
+ TYPES_sd_signed (S, D, T), TYPES_sd_unsigned (S, D, T)
+
+#define TYPES_sd_data(S, D, T) \
+ TYPES_s_data (S, D, T), \
+ TYPES_d_data (S, D, T)
+
+/* _f16 _f32 _f64
+ _s32 _s64
+ _u32 _u64. */
+#define TYPES_all_float_and_sd_integer(S, D, T) \
+ TYPES_all_float (S, D, T), TYPES_sd_integer (S, D, T)
+
+/* _f64. */
+#define TYPES_d_float(S, D, T) \
+ S (f64)
+
+/* _u64. */
+#define TYPES_d_unsigned(S, D, T) \
+ S (u64)
+
+/* _s64
+ _u64. */
+#define TYPES_d_integer(S, D, T) \
+ S (s64), TYPES_d_unsigned (S, D, T)
+
+/* _f64
+ _s64
+ _u64. */
+#define TYPES_d_data(S, D, T) \
+ TYPES_d_float (S, D, T), TYPES_d_integer (S, D, T)
+
+/* All the type combinations allowed by svcvt. */
+#define TYPES_cvt(S, D, T) \
+ D (f16, f32), D (f16, f64), \
+ D (f16, s16), D (f16, s32), D (f16, s64), \
+ D (f16, u16), D (f16, u32), D (f16, u64), \
+ \
+ D (f32, f16), D (f32, f64), \
+ D (f32, s32), D (f32, s64), \
+ D (f32, u32), D (f32, u64), \
+ \
+ D (f64, f16), D (f64, f32), \
+ D (f64, s32), D (f64, s64), \
+ D (f64, u32), D (f64, u64), \
+ \
+ D (s16, f16), \
+ D (s32, f16), D (s32, f32), D (s32, f64), \
+ D (s64, f16), D (s64, f32), D (s64, f64), \
+ \
+ D (u16, f16), \
+ D (u32, f16), D (u32, f32), D (u32, f64), \
+ D (u64, f16), D (u64, f32), D (u64, f64)
+
+/* _bf16_f32. */
+#define TYPES_cvt_bfloat(S, D, T) \
+ D (bf16, f32)
+
+/* { _bf16 _f16 } x _f32. */
+#define TYPES_cvt_h_s_float(S, D, T) \
+ D (bf16, f32), D (f16, f32)
+
+/* _f32_f16. */
+#define TYPES_cvt_f32_f16(S, D, T) \
+ D (f32, f16)
+
+/* _f32_f16
+ _f64_f32. */
+#define TYPES_cvt_long(S, D, T) \
+ D (f32, f16), D (f64, f32)
+
+/* _f32_f64. */
+#define TYPES_cvt_narrow_s(S, D, T) \
+ D (f32, f64)
+
+/* _f16_f32
+ _f32_f64. */
+#define TYPES_cvt_narrow(S, D, T) \
+ D (f16, f32), TYPES_cvt_narrow_s (S, D, T)
+
+/* { _s32 _u32 } x _f32
+
+ _f32 x { _s32 _u32 }. */
+#define TYPES_cvt_s_s(S, D, T) \
+ D (s32, f32), \
+ D (u32, f32), \
+ D (f32, s32), \
+ D (f32, u32)
+
+/* _f16_mf8
+ _bf16_mf8. */
+#define TYPES_cvt_mf8(S, D, T) \
+ D (f16, mf8), D (bf16, mf8)
+
+/* _mf8_f16
+ _mf8_bf16. */
+#define TYPES_cvtn_mf8(S, D, T) \
+ D (mf8, f16), D (mf8, bf16)
+
+/* _mf8_f32. */
+#define TYPES_cvtnx_mf8(S, D, T) \
+ D (mf8, f32)
+
+/* { _s32 _s64 } x { _b8 _b16 _b32 _b64 }
+ { _u32 _u64 }. */
+#define TYPES_inc_dec_n1(D, A) \
+ D (A, b8), D (A, b16), D (A, b32), D (A, b64)
+#define TYPES_inc_dec_n(S, D, T) \
+ TYPES_inc_dec_n1 (D, s32), \
+ TYPES_inc_dec_n1 (D, s64), \
+ TYPES_inc_dec_n1 (D, u32), \
+ TYPES_inc_dec_n1 (D, u64)
+
+/* { _s16 _u16 } x _s32
+
+ { _u16 } x _u32. */
+#define TYPES_qcvt_x2(S, D, T) \
+ D (s16, s32), \
+ D (u16, u32), \
+ D (u16, s32)
+
+/* { _s8 _u8 } x _s32
+
+ { _u8 } x _u32
+
+ { _s16 _u16 } x _s64
+
+ { _u16 } x _u64. */
+#define TYPES_qcvt_x4(S, D, T) \
+ D (s8, s32), \
+ D (u8, u32), \
+ D (u8, s32), \
+ D (s16, s64), \
+ D (u16, u64), \
+ D (u16, s64)
+
+/* _s16_s32
+ _u16_u32. */
+#define TYPES_qrshr_x2(S, D, T) \
+ D (s16, s32), \
+ D (u16, u32)
+
+/* _u16_s32. */
+#define TYPES_qrshru_x2(S, D, T) \
+ D (u16, s32)
+
+/* _s8_s32
+ _s16_s64
+ _u8_u32
+ _u16_u64. */
+#define TYPES_qrshr_x4(S, D, T) \
+ D (s8, s32), \
+ D (s16, s64), \
+ D (u8, u32), \
+ D (u16, u64)
+
+/* _u8_s32
+ _u16_s64. */
+#define TYPES_qrshru_x4(S, D, T) \
+ D (u8, s32), \
+ D (u16, s64)
+
+/* { _mf8 _bf16 } { _mf8 _bf16 }
+ { _f16 _f32 _f64 } { _f16 _f32 _f64 }
+ { _s8 _s16 _s32 _s64 } x { _s8 _s16 _s32 _s64 }
+ { _u8 _u16 _u32 _u64 } { _u8 _u16 _u32 _u64 }. */
+#define TYPES_reinterpret1(D, A) \
+ D (A, mf8), \
+ D (A, bf16), \
+ D (A, f16), D (A, f32), D (A, f64), \
+ D (A, s8), D (A, s16), D (A, s32), D (A, s64), \
+ D (A, u8), D (A, u16), D (A, u32), D (A, u64)
+#define TYPES_reinterpret(S, D, T) \
+ TYPES_reinterpret1 (D, mf8), \
+ TYPES_reinterpret1 (D, bf16), \
+ TYPES_reinterpret1 (D, f16), \
+ TYPES_reinterpret1 (D, f32), \
+ TYPES_reinterpret1 (D, f64), \
+ TYPES_reinterpret1 (D, s8), \
+ TYPES_reinterpret1 (D, s16), \
+ TYPES_reinterpret1 (D, s32), \
+ TYPES_reinterpret1 (D, s64), \
+ TYPES_reinterpret1 (D, u8), \
+ TYPES_reinterpret1 (D, u16), \
+ TYPES_reinterpret1 (D, u32), \
+ TYPES_reinterpret1 (D, u64)
+
+/* _b_c
+ _c_b. */
+#define TYPES_reinterpret_b(S, D, T) \
+ D (b, c), \
+ D (c, b)
+
+/* { _b8 _b16 _b32 _b64 } x { _s32 _s64 }
+ { _u32 _u64 } */
+#define TYPES_while1(D, bn) \
+ D (bn, s32), D (bn, s64), D (bn, u32), D (bn, u64)
+#define TYPES_while(S, D, T) \
+ TYPES_while1 (D, b8), \
+ TYPES_while1 (D, b16), \
+ TYPES_while1 (D, b32), \
+ TYPES_while1 (D, b64)
+
+/* { _b8 _b16 _b32 _b64 } x { _s64 }
+ { _u64 } */
+#define TYPES_while_x(S, D, T) \
+ D (b8, s64), D (b8, u64), \
+ D (b16, s64), D (b16, u64), \
+ D (b32, s64), D (b32, u64), \
+ D (b64, s64), D (b64, u64)
+
+/* { _c8 _c16 _c32 _c64 } x { _s64 }
+ { _u64 } */
+#define TYPES_while_x_c(S, D, T) \
+ D (c8, s64), D (c8, u64), \
+ D (c16, s64), D (c16, u64), \
+ D (c32, s64), D (c32, u64), \
+ D (c64, s64), D (c64, u64)
+
+/* _f32_f16
+ _s32_s16
+ _u32_u16. */
+#define TYPES_s_narrow_fsu(S, D, T) \
+ D (f32, f16), D (s32, s16), D (u32, u16)
+
+/* _za8 _za16 _za32 _za64 _za128. */
+#define TYPES_all_za(S, D, T) \
+ S (za8), S (za16), S (za32), S (za64), S (za128)
+
+/* _za64. */
+#define TYPES_d_za(S, D, T) \
+ S (za64)
+
+/* { _za8 } x { _mf8 _s8 _u8 }
+ { _za16 } x { _bf16 _f16 _s16 _u16 }
+ { _za32 } x { _f32 _s32 _u32 }
+ { _za64 } x { _f64 _s64 _u64 }. */
+#define TYPES_za_bhsd_data(S, D, T) \
+ D (za8, mf8), D (za8, s8), D (za8, u8), \
+ D (za16, bf16), D (za16, f16), D (za16, s16), D (za16, u16), \
+ D (za32, f32), D (za32, s32), D (za32, u32), \
+ D (za64, f64), D (za64, s64), D (za64, u64)
+
+/* Likewise, plus:
+
+ { _za128 } x { _bf16 }
+ { _f16 _f32 _f64 }
+ { _s8 _s16 _s32 _s64 }
+ { _u8 _u16 _u32 _u64 }. */
+
+#define TYPES_za_all_data(S, D, T) \
+ TYPES_za_bhsd_data (S, D, T), \
+ TYPES_reinterpret1 (D, za128)
+
+/* _za16_mf8. */
+#define TYPES_za_h_mf8(S, D, T) \
+ D (za16, mf8)
+
+/* { _za_16 _za_32 } x _mf8. */
+#define TYPES_za_hs_mf8(S, D, T) \
+ D (za16, mf8), D (za32, mf8)
+
+/* _za16_bf16. */
+#define TYPES_za_h_bfloat(S, D, T) \
+ D (za16, bf16)
+
+/* _za16_f16. */
+#define TYPES_za_h_float(S, D, T) \
+ D (za16, f16)
+
+/* _za32_s8. */
+#define TYPES_za_s_b_signed(S, D, T) \
+ D (za32, s8)
+
+/* _za32_u8. */
+#define TYPES_za_s_b_unsigned(S, D, T) \
+ D (za32, u8)
+
+/* _za32 x { _s8 _u8 }. */
+#define TYPES_za_s_b_integer(S, D, T) \
+ D (za32, s8), D (za32, u8)
+
+/* _za32 x { _s16 _u16 }. */
+#define TYPES_za_s_h_integer(S, D, T) \
+ D (za32, s16), D (za32, u16)
+
+/* _za32 x { _bf16 _f16 _s16 _u16 }. */
+#define TYPES_za_s_h_data(S, D, T) \
+ D (za32, bf16), D (za32, f16), D (za32, s16), D (za32, u16)
+
+/* _za32_u32. */
+#define TYPES_za_s_unsigned(S, D, T) \
+ D (za32, u32)
+
+/* _za32 x { _s32 _u32 }. */
+#define TYPES_za_s_integer(S, D, T) \
+ D (za32, s32), D (za32, u32)
+
+/* _za32_mf8. */
+#define TYPES_za_s_mf8(S, D, T) \
+ D (za32, mf8)
+
+/* _za32_f32. */
+#define TYPES_za_s_float(S, D, T) \
+ D (za32, f32)
+
+/* _za32 x { _f32 _s32 _u32 }. */
+#define TYPES_za_s_data(S, D, T) \
+ D (za32, f32), D (za32, s32), D (za32, u32)
+
+/* _za64 x { _s16 _u16 }. */
+#define TYPES_za_d_h_integer(S, D, T) \
+ D (za64, s16), D (za64, u16)
+
+/* _za64_f64. */
+#define TYPES_za_d_float(S, D, T) \
+ D (za64, f64)
+
+/* _za64 x { _s64 _u64 }. */
+#define TYPES_za_d_integer(S, D, T) \
+ D (za64, s64), D (za64, u64)
+
+/* _za32 x { _s8 _u8 _bf16 _f16 _f32 }. */
+#define TYPES_mop_base(S, D, T) \
+ D (za32, s8), D (za32, u8), D (za32, bf16), D (za32, f16), D (za32, f32)
+
+/* _za32_s8. */
+#define TYPES_mop_base_signed(S, D, T) \
+ D (za32, s8)
+
+/* _za32_u8. */
+#define TYPES_mop_base_unsigned(S, D, T) \
+ D (za32, u8)
+
+/* _za64 x { _s16 _u16 }. */
+#define TYPES_mop_i16i64(S, D, T) \
+ D (za64, s16), D (za64, u16)
+
+/* _za64_s16. */
+#define TYPES_mop_i16i64_signed(S, D, T) \
+ D (za64, s16)
+
+/* _za64_u16. */
+#define TYPES_mop_i16i64_unsigned(S, D, T) \
+ D (za64, u16)
+
+/* _za. */
+#define TYPES_za(S, D, T) \
+ S (za)
+
+/* _p8 _p16 _p64. */
+#define TYPES_bhd_poly(S, D, T) \
+ S (p8), S (p16), S (p64)
+
+/* _p8 _p16 _p64 _p128. */
+#define TYPES_bhdq_poly(S, D, T) \
+ S (p8), S (p16), S (p64), S (p128)
+
+/* Describe a tuple of type suffixes in which only the first is used. */
+#define DEF_VECTOR_TYPE(X) \
+ { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }
+
+/* Describe a tuple of type suffixes in which only the first two are used. */
+#define DEF_DOUBLE_TYPE(X, Y) \
+ { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y, NUM_TYPE_SUFFIXES }
+
+/* Describe a tuple of type suffixes in which three elements are used. */
+#define DEF_TRIPLE_TYPE(X, Y, Z) \
+ { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y, TYPE_SUFFIX_ ## Z }
+
+/* Create an array that can be used in aarch64-sve-builtins.def to
+ select the type suffixes in TYPES_<NAME>. */
+#define DEF_SVE_TYPES_ARRAY(NAME) \
+ static const type_suffix_triple types_##NAME[] = { \
+ TYPES_##NAME (DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE, DEF_TRIPLE_TYPE), \
+ { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } \
+ }
+
+/* For functions that don't take any type suffixes. */
+static const type_suffix_triple types_none[] = {
+ { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES },
+ { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }
+};
+
+/* Create an array for each TYPES_<combination> macro above. */
+DEF_SVE_TYPES_ARRAY (all_pred);
+DEF_SVE_TYPES_ARRAY (all_count);
+DEF_SVE_TYPES_ARRAY (all_pred_count);
+DEF_SVE_TYPES_ARRAY (all_float);
+DEF_SVE_TYPES_ARRAY (all_signed);
+DEF_SVE_TYPES_ARRAY (all_float_and_signed);
+DEF_SVE_TYPES_ARRAY (all_unsigned);
+DEF_SVE_TYPES_ARRAY (all_integer);
+DEF_SVE_TYPES_ARRAY (all_arith);
+DEF_SVE_TYPES_ARRAY (all_arith_no_fp16);
+DEF_SVE_TYPES_ARRAY (all_data);
+DEF_SVE_TYPES_ARRAY (b);
+DEF_SVE_TYPES_ARRAY (b_unsigned);
+DEF_SVE_TYPES_ARRAY (b_integer);
+DEF_SVE_TYPES_ARRAY (bh_integer);
+DEF_SVE_TYPES_ARRAY (bs_unsigned);
+DEF_SVE_TYPES_ARRAY (bhs_signed);
+DEF_SVE_TYPES_ARRAY (bhs_unsigned);
+DEF_SVE_TYPES_ARRAY (bhs_integer);
+DEF_SVE_TYPES_ARRAY (bh_data);
+DEF_SVE_TYPES_ARRAY (bhs_data);
+DEF_SVE_TYPES_ARRAY (bhs_widen);
+DEF_SVE_TYPES_ARRAY (c);
+DEF_SVE_TYPES_ARRAY (h_bfloat);
+DEF_SVE_TYPES_ARRAY (h_float);
+DEF_SVE_TYPES_ARRAY (h_float_mf8);
+DEF_SVE_TYPES_ARRAY (h_integer);
+DEF_SVE_TYPES_ARRAY (h_data);
+DEF_SVE_TYPES_ARRAY (hs_signed);
+DEF_SVE_TYPES_ARRAY (hs_integer);
+DEF_SVE_TYPES_ARRAY (hs_float);
+DEF_SVE_TYPES_ARRAY (hs_data);
+DEF_SVE_TYPES_ARRAY (hd_unsigned);
+DEF_SVE_TYPES_ARRAY (hsd_signed);
+DEF_SVE_TYPES_ARRAY (hsd_integer);
+DEF_SVE_TYPES_ARRAY (hsd_data);
+DEF_SVE_TYPES_ARRAY (s_float);
+DEF_SVE_TYPES_ARRAY (s_float_hsd_integer);
+DEF_SVE_TYPES_ARRAY (s_float_mf8);
+DEF_SVE_TYPES_ARRAY (s_float_sd_integer);
+DEF_SVE_TYPES_ARRAY (s_signed);
+DEF_SVE_TYPES_ARRAY (s_unsigned);
+DEF_SVE_TYPES_ARRAY (s_integer);
+DEF_SVE_TYPES_ARRAY (s_data);
+DEF_SVE_TYPES_ARRAY (sd_signed);
+DEF_SVE_TYPES_ARRAY (sd_unsigned);
+DEF_SVE_TYPES_ARRAY (sd_integer);
+DEF_SVE_TYPES_ARRAY (sd_data);
+DEF_SVE_TYPES_ARRAY (all_float_and_sd_integer);
+DEF_SVE_TYPES_ARRAY (d_float);
+DEF_SVE_TYPES_ARRAY (d_unsigned);
+DEF_SVE_TYPES_ARRAY (d_integer);
+DEF_SVE_TYPES_ARRAY (d_data);
+DEF_SVE_TYPES_ARRAY (cvt);
+DEF_SVE_TYPES_ARRAY (cvt_bfloat);
+DEF_SVE_TYPES_ARRAY (cvt_h_s_float);
+DEF_SVE_TYPES_ARRAY (cvt_f32_f16);
+DEF_SVE_TYPES_ARRAY (cvt_long);
+DEF_SVE_TYPES_ARRAY (cvt_mf8);
+DEF_SVE_TYPES_ARRAY (cvt_narrow_s);
+DEF_SVE_TYPES_ARRAY (cvt_narrow);
+DEF_SVE_TYPES_ARRAY (cvt_s_s);
+DEF_SVE_TYPES_ARRAY (cvtn_mf8);
+DEF_SVE_TYPES_ARRAY (cvtnx_mf8);
+DEF_SVE_TYPES_ARRAY (inc_dec_n);
+DEF_SVE_TYPES_ARRAY (qcvt_x2);
+DEF_SVE_TYPES_ARRAY (qcvt_x4);
+DEF_SVE_TYPES_ARRAY (qrshr_x2);
+DEF_SVE_TYPES_ARRAY (qrshr_x4);
+DEF_SVE_TYPES_ARRAY (qrshru_x2);
+DEF_SVE_TYPES_ARRAY (qrshru_x4);
+DEF_SVE_TYPES_ARRAY (reinterpret);
+DEF_SVE_TYPES_ARRAY (reinterpret_b);
+DEF_SVE_TYPES_ARRAY (while);
+DEF_SVE_TYPES_ARRAY (while_x);
+DEF_SVE_TYPES_ARRAY (while_x_c);
+DEF_SVE_TYPES_ARRAY (s_narrow_fsu);
+DEF_SVE_TYPES_ARRAY (all_za);
+DEF_SVE_TYPES_ARRAY (d_za);
+DEF_SVE_TYPES_ARRAY (za_bhsd_data);
+DEF_SVE_TYPES_ARRAY (za_all_data);
+DEF_SVE_TYPES_ARRAY (za_h_mf8);
+DEF_SVE_TYPES_ARRAY (za_h_bfloat);
+DEF_SVE_TYPES_ARRAY (za_h_float);
+DEF_SVE_TYPES_ARRAY (za_s_b_signed);
+DEF_SVE_TYPES_ARRAY (za_s_b_unsigned);
+DEF_SVE_TYPES_ARRAY (za_s_b_integer);
+DEF_SVE_TYPES_ARRAY (za_s_h_integer);
+DEF_SVE_TYPES_ARRAY (za_s_h_data);
+DEF_SVE_TYPES_ARRAY (za_s_unsigned);
+DEF_SVE_TYPES_ARRAY (za_s_integer);
+DEF_SVE_TYPES_ARRAY (za_s_mf8);
+DEF_SVE_TYPES_ARRAY (za_hs_mf8);
+DEF_SVE_TYPES_ARRAY (za_s_float);
+DEF_SVE_TYPES_ARRAY (za_s_data);
+DEF_SVE_TYPES_ARRAY (za_d_h_integer);
+DEF_SVE_TYPES_ARRAY (za_d_float);
+DEF_SVE_TYPES_ARRAY (za_d_integer);
+DEF_SVE_TYPES_ARRAY (mop_base);
+DEF_SVE_TYPES_ARRAY (mop_base_signed);
+DEF_SVE_TYPES_ARRAY (mop_base_unsigned);
+DEF_SVE_TYPES_ARRAY (mop_i16i64);
+DEF_SVE_TYPES_ARRAY (mop_i16i64_signed);
+DEF_SVE_TYPES_ARRAY (mop_i16i64_unsigned);
+DEF_SVE_TYPES_ARRAY (za);
+
+DEF_SVE_TYPES_ARRAY (bhd_poly);
+DEF_SVE_TYPES_ARRAY (bhdq_poly);
+
+static const group_suffix_index groups_none[] = {
+ GROUP_none, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_x2[] = { GROUP_x2, NUM_GROUP_SUFFIXES };
+
+static const group_suffix_index groups_x12[] = {
+ GROUP_none, GROUP_x2, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_x4[] = { GROUP_x4, NUM_GROUP_SUFFIXES };
+
+static const group_suffix_index groups_x24[] = {
+ GROUP_x2, GROUP_x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_x124[] = {
+ GROUP_none, GROUP_x2, GROUP_x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_x1234[] = {
+ GROUP_none, GROUP_x2, GROUP_x3, GROUP_x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg1x2[] = {
+ GROUP_vg1x2, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg1x4[] = {
+ GROUP_vg1x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg1x24[] = {
+ GROUP_vg1x2, GROUP_vg1x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg2[] = {
+ GROUP_vg2x1, GROUP_vg2x2, GROUP_vg2x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg4[] = {
+ GROUP_vg4x1, GROUP_vg4x2, GROUP_vg4x4, NUM_GROUP_SUFFIXES
+};
+
+static const group_suffix_index groups_vg24[] = {
+ GROUP_vg2, GROUP_vg4, NUM_GROUP_SUFFIXES
+};
+
+/* Used by functions that have no governing predicate. */
+static const predication_index preds_none[] = { PRED_none, NUM_PREDS };
+
+/* Used by functions that have a governing predicate but do not have an
+ explicit suffix. */
+static const predication_index preds_implicit[] = { PRED_implicit, NUM_PREDS };
+
+/* Used by functions that only support "_m" predication. */
+static const predication_index preds_m[] = { PRED_m, NUM_PREDS };
+
+/* Used by functions that allow merging and "don't care" predication,
+ but are not suitable for predicated MOVPRFX. */
+static const predication_index preds_mx[] = {
+ PRED_m, PRED_x, NUM_PREDS
+};
+
+/* Used by functions that allow merging, zeroing and "don't care"
+ predication. */
+static const predication_index preds_mxz[] = {
+ PRED_m, PRED_x, PRED_z, NUM_PREDS
+};
+
+/* Used by functions that have the mxz predicated forms above, and in addition
+ have an unpredicated form. */
+static const predication_index preds_mxz_or_none[] = {
+ PRED_m, PRED_x, PRED_z, PRED_none, NUM_PREDS
+};
+
+/* Used by functions that allow merging and zeroing predication but have
+ no "_x" form. */
+static const predication_index preds_mz[] = { PRED_m, PRED_z, NUM_PREDS };
+
+/* Used by functions that have an unpredicated form and a _z predicated
+ form. */
+static const predication_index preds_z_or_none[] = {
+ PRED_z, PRED_none, NUM_PREDS
+};
+
+/* Used by (mostly predicate) functions that only support "_z" predication. */
+static const predication_index preds_z[] = { PRED_z, NUM_PREDS };
+
+/* Used by SME instructions that always merge into ZA. */
+static const predication_index preds_za_m[] = { PRED_za_m, NUM_PREDS };
}
#endif
diff --git a/gcc/config/aarch64/aarch64-builtins.cc
b/gcc/config/aarch64/aarch64-builtins.cc
index 611f6dc45e0a..e9e237f65aae 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -52,6 +52,7 @@
#include "tree-pass.h"
#include "tree-vector-builder.h"
#include "aarch64-builtins.h"
+#include "aarch64-neon-builtins.h"
using namespace aarch64;
@@ -1938,12 +1939,11 @@ aarch64_target_switcher::~aarch64_target_switcher ()
sizeof (have_regs_of_mode));
}
-/* Implement #pragma GCC aarch64 "arm_neon.h".
-
- The types and functions defined here need to be available internally
- during LTO as well. */
+/* Initialize NEON builtins using the old framework.
+ Delete once NEON all intrinsics have been ported to the pragma-based
+ framework. */
void
-handle_arm_neon_h (void)
+init_arm_neon_builtins (void)
{
aarch64_target_switcher switcher (AARCH64_FL_SIMD);
@@ -1971,7 +1971,7 @@ aarch64_init_simd_builtins (void)
aarch64_init_simd_builtin_functions (false);
if (in_lto_p)
- handle_arm_neon_h ();
+ init_arm_neon_builtins ();
/* Initialize the remaining fcmla_laneq intrinsics. */
aarch64_init_fcmla_laneq_builtins ();
diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index ef2475154e85..85842152862b 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -32,6 +32,7 @@
#include "c-family/c-pragma.h"
#include "langhooks.h"
#include "target.h"
+#include "aarch64-neon-builtins.h"
#define builtin_define(TXT) cpp_define (pfile, TXT)
@@ -409,7 +410,7 @@ aarch64_pragma_aarch64 (cpp_reader *)
else if (strcmp (name, "arm_sme.h") == 0)
aarch64_acle::handle_arm_sme_h (false);
else if (strcmp (name, "arm_neon.h") == 0)
- handle_arm_neon_h ();
+ aarch64_acle::handle_arm_neon_h (false);
else if (strcmp (name, "arm_acle.h") == 0)
handle_arm_acle_h ();
else if (strcmp (name, "arm_neon_sve_bridge.h") == 0)
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-base.cc
b/gcc/config/aarch64/aarch64-neon-builtins-base.cc
new file mode 100644
index 000000000000..4c3c33c56629
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-base.cc
@@ -0,0 +1,113 @@
+/* ACLE support for AArch64 NEON (__ARM_FEATURE_SIMD intrinsics)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "memmodel.h"
+#include "insn-codes.h"
+#include "optabs.h"
+#include "recog.h"
+#include "expr.h"
+#include "basic-block.h"
+#include "function.h"
+#include "fold-const.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimplify.h"
+#include "explow.h"
+#include "tree-vector-builder.h"
+#include "rtx-vector-builder.h"
+#include "vec-perm-indices.h"
+#include "aarch64-acle-builtins.h"
+#include "aarch64-neon-builtins-base.h"
+#include "aarch64-neon-builtins-functions.h"
+#include "aarch64-neon-builtins.h"
+#include "gimple-fold.h"
+
+using namespace aarch64_acle;
+
+/* Base class for all function expanders.
+ At least one of `expand` or `fold` must be overriden by derived classes. */
+class gimple_function_base : public function_base
+{
+ rtx expand (function_expander &) const override { gcc_unreachable (); }
+ gimple *fold (gimple_folder &) const override { gcc_unreachable (); }
+};
+
+/* For intrinsics that map to a single GIMPLE expression with no argument
+ preparation necessary. */
+class gimple_expr : public gimple_function_base
+{
+ tree_code m_int_code;
+ tree_code m_float_code;
+ tree_code m_poly_code;
+
+public:
+ constexpr gimple_expr (tree_code code)
+ : m_int_code (code),
+ m_float_code (code),
+ m_poly_code (code)
+ {}
+
+ constexpr gimple_expr (tree_code int_code,
+ tree_code float_code,
+ tree_code poly_code)
+ : m_int_code (int_code),
+ m_float_code (float_code),
+ m_poly_code (poly_code)
+ {}
+
+ gimple *fold (gimple_folder &f) const override
+ {
+ auto nargs = gimple_call_num_args (f.call);
+ auto arg0 = nargs >= 1 ? gimple_call_arg (f.call, 0) : nullptr;
+ auto arg1 = nargs >= 2 ? gimple_call_arg (f.call, 1) : nullptr;
+ auto arg2 = nargs >= 3 ? gimple_call_arg (f.call, 2) : nullptr;
+
+ tree_code code;
+ auto type_class = f.type_suffix (0).tclass;
+ switch (type_class)
+ {
+ case TYPE_signed:
+ case TYPE_unsigned:
+ code = m_int_code;
+ break;
+ case TYPE_float:
+ code = m_float_code;
+ break;
+ case TYPE_poly:
+ code = m_poly_code;
+ break;
+ default:
+ gcc_unreachable ();
+ }
+
+ return gimple_build_assign (f.lhs, code, arg0, arg1, arg2);
+ }
+};
+
+// Lanewise arithmetic
+NEON_FUNCTION (vaddd, gimple_expr, (PLUS_EXPR))
+NEON_FUNCTION (vadd, gimple_expr, (PLUS_EXPR, PLUS_EXPR, BIT_XOR_EXPR))
+NEON_FUNCTION (vaddq, gimple_expr, (PLUS_EXPR, PLUS_EXPR, BIT_XOR_EXPR))
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-base.def
b/gcc/config/aarch64/aarch64-neon-builtins-base.def
new file mode 100644
index 000000000000..c8077d96a7dd
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-base.def
@@ -0,0 +1,33 @@
+/* ACLE support for AArch64 NEON (__ARM_FEATURE_SIMD intrinsics)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+// Lanewise arithmetic
+#define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_SIMD)
+DEF_NEON_FUNCTION (vaddd, d_integer, ("s0,s0,s0"))
+DEF_NEON_FUNCTION (vadd, all_arith_no_fp16, ("D0,D0,D0"))
+DEF_NEON_FUNCTION (vadd, bhd_poly, ("D0,D0,D0"))
+DEF_NEON_FUNCTION (vaddq, all_arith_no_fp16, ("Q0,Q0,Q0"))
+DEF_NEON_FUNCTION (vaddq, bhdq_poly, ("Q0,Q0,Q0"))
+#undef REQUIRED_EXTENSIONS
+
+// Lanewise arithmetic (FP16)
+#define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_F16)
+DEF_NEON_FUNCTION (vadd, h_float, ("D0,D0,D0"))
+DEF_NEON_FUNCTION (vaddq, h_float, ("Q0,Q0,Q0"))
+#undef REQUIRED_EXTENSIONS
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-base.h
b/gcc/config/aarch64/aarch64-neon-builtins-base.h
new file mode 100644
index 000000000000..9612bef42f26
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-base.h
@@ -0,0 +1,29 @@
+/* ACLE support for AArch64 NEON (__ARM_FEATURE_SIMD intrinsics)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_AARCH64_NEON_BUILTINS_BASE_H
+#define GCC_AARCH64_NEON_BUILTINS_BASE_H
+
+namespace aarch64_acle::functions {
+#define DEF_NEON_FUNCTION(NAME, ...) \
+ extern const aarch64_acle::function_base *const NAME;
+#include "aarch64-neon-builtins.def"
+}
+
+#endif
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-functions.h
b/gcc/config/aarch64/aarch64-neon-builtins-functions.h
new file mode 100644
index 000000000000..58a631ac54e0
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-functions.h
@@ -0,0 +1,29 @@
+/* ACLE support for AArch64 NEON (function_base classes)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_AARCH64_NEON_BUILTINS_FUNCTIONS_H
+#define GCC_AARCH64_NEON_BUILTINS_FUNCTIONS_H
+
+/* Declare the global function base NAME, creating it from an instance
+ of class CLASS with constructor arguments ARGS. */
+#define NEON_FUNCTION(NAME, CLASS, ARGS) \
+ namespace { static constexpr const CLASS NAME##_obj ARGS; } \
+ const function_base *const aarch64_acle::functions::NAME = &NAME##_obj;
+
+#endif
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-shapes.cc
b/gcc/config/aarch64/aarch64-neon-builtins-shapes.cc
new file mode 100644
index 000000000000..7946a7675eb5
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-shapes.cc
@@ -0,0 +1,69 @@
+/* ACLE support for AArch64 NEON (function shapes)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#define INCLUDE_ALGORITHM
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "basic-block.h"
+#include "tree.h"
+#include "function.h"
+#include "gimple.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "memmodel.h"
+#include "insn-codes.h"
+#include "optabs.h"
+#include "aarch64-acle-builtins.h"
+#include "aarch64-sve-builtins-shapes.h"
+#include "aarch64-builtins.h"
+
+using namespace aarch64_acle;
+
+/* All NEON functions are non-overloaded, so we don't need bespoke
+ function shapes. Instead, we can just use a single shape for all NEON
+ functions, parameterised by a signature. */
+struct neon_shape : public function_shape
+{
+ constexpr neon_shape (const char *signature)
+ : m_signature (signature)
+ {}
+
+ const char *m_signature;
+
+ void build (function_builder &b,
+ const function_group_info &group) const override
+ {
+ aarch64_acle::build_all (b, this->m_signature, group, MODE_none);
+ }
+
+ bool check (function_checker &) const override { return true; }
+
+ bool explicit_type_suffix_p (unsigned int) const override { return true; }
+ tree resolve (function_resolver &) const override { gcc_unreachable (); }
+};
+
+namespace aarch64_acle::shapes {
+#define DEF_NEON_FUNCTION(NAME, TYPES, SHAPE_ARGS) \
+ static constexpr const neon_shape OBJ_NAME (NAME, TYPES) SHAPE_ARGS; \
+ const aarch64_acle::function_shape *SHAPE_NAME (NAME, TYPES) \
+ = &OBJ_NAME (NAME, TYPES);
+#include "aarch64-neon-builtins.def"
+}
diff --git a/gcc/config/aarch64/aarch64-neon-builtins-shapes.h
b/gcc/config/aarch64/aarch64-neon-builtins-shapes.h
new file mode 100644
index 000000000000..c94f4c994643
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins-shapes.h
@@ -0,0 +1,29 @@
+/* ACLE support for AArch64 NEON (function shapes)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_AARCH64_NEON_BUILTINS_SHAPES_H
+#define GCC_AARCH64_NEON_BUILTINS_SHAPES_H
+
+namespace aarch64_acle::shapes {
+#define DEF_NEON_FUNCTION(NAME, TYPES, SHAPE_ARGS) \
+ extern const aarch64_acle::function_shape *const SHAPE_NAME (NAME, TYPES);
+#include "aarch64-neon-builtins.def"
+}
+
+#endif
diff --git a/gcc/config/aarch64/aarch64-neon-builtins.cc
b/gcc/config/aarch64/aarch64-neon-builtins.cc
new file mode 100644
index 000000000000..7159b265ec9c
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins.cc
@@ -0,0 +1,86 @@
+/* ACLE support for AArch64 NEON
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#define IN_TARGET_CODE 1
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "rtl.h"
+#include "tm_p.h"
+#include "memmodel.h"
+#include "insn-codes.h"
+#include "optabs.h"
+#include "diagnostic.h"
+#include "expr.h"
+#include "basic-block.h"
+#include "function.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimplify.h"
+#include "explow.h"
+#include "aarch64-acle-builtins.h"
+#include "aarch64-sve-builtins-base.h"
+#include "aarch64-sve-builtins-shapes.h"
+#include "aarch64-neon-builtins-shapes.h"
+#include "aarch64-neon-builtins-functions.h"
+#include "aarch64-neon-builtins-base.h"
+#include "aarch64-builtins.h"
+
+/* Implement `#pragma GCC aarch64 "arm_neon"`. */
+namespace aarch64_acle {
+constexpr const aarch64_acle::function_group_info neon_function_groups[] = {
+#define DEF_NEON_FUNCTION(NAME, TYPES, SHAPE_ARGS) \
+ { \
+ /* .base_name = */ #NAME, \
+ /* .base = */ &aarch64_acle::functions::NAME,
\
+ /* .shape = */ &aarch64_acle::shapes::SHAPE_NAME (NAME, TYPES),\
+ /* .types = */ aarch64_acle::types_##TYPES, \
+ /* .groups = */ aarch64_acle::groups_none, \
+ /* .preds = */ aarch64_acle::preds_none, \
+ /* .extensions = */ aarch64_required_extensions::REQUIRED_EXTENSIONS,\
+ /* .fpm_mode = */ aarch64_acle::FPM_unused, \
+ },
+#include "aarch64-neon-builtins.def"
+};
+
+bool arm_neon_h_handled = false;
+
+void
+handle_arm_neon_h (bool function_nulls_p)
+{
+ if (arm_neon_h_handled)
+ return;
+
+ /* FIXME: Remove once all NEON intrinsics have been ported to the
pragma-based
+ framework. */
+ init_arm_neon_builtins ();
+
+ aarch64_target_switcher switcher;
+ aarch64_acle::function_builder builder (aarch64_acle::arm_neon_handle,
+ function_nulls_p);
+
+ for (auto &group : neon_function_groups)
+ builder.register_function_group (group);
+
+ arm_neon_h_handled = true;
+}
+};
diff --git a/gcc/config/aarch64/aarch64-neon-builtins.def
b/gcc/config/aarch64/aarch64-neon-builtins.def
new file mode 100644
index 000000000000..58630eebe489
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins.def
@@ -0,0 +1,40 @@
+/* ACLE support for AArch64 NEON (__ARM_FEATURE_SIMD intrinsics)
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+/* Code organization: See block comment at the top of
+ aarch64-sve-builtins.def. */
+
+/* Define a new function group. */
+#ifndef DEF_NEON_FUNCTION
+#define DEF_NEON_FUNCTION(NAME, TYPES, SHAPE_ARGS)
+#endif
+
+/* Helper for generating the name of the function_group's corresponding
+ neon_shape instance. */
+#define OBJ_NAME(NAME, TYPES) NAME ## _ ## TYPES ## _obj
+
+/* Helper for generating the name of the function_group's corresponding
+ function_shape. */
+#define SHAPE_NAME(NAME, TYPES) NAME ## _ ## TYPES ## _shape
+
+#include "aarch64-neon-builtins-base.def"
+
+#undef DEF_NEON_FUNCTION
+#undef OBJ_NAME
+#undef SHAPE_NAME
diff --git a/gcc/config/aarch64/aarch64-neon-builtins.h
b/gcc/config/aarch64/aarch64-neon-builtins.h
new file mode 100644
index 000000000000..c3bf53674cbc
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-neon-builtins.h
@@ -0,0 +1,28 @@
+/* ACLE support for AArch64 NEON
+ Copyright (C) 2026-2026 Free Software Foundation, Inc.
+
+ This file is part of GCC.
+
+ GCC is free software; you can redistribute it and/or modify it
+ under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 3, or (at your option)
+ any later version.
+
+ GCC is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with GCC; see the file COPYING3. If not see
+ <http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_AARCH64_NEON_BUILTINS_H
+#define GCC_AARCH64_NEON_BUILTINS_H
+
+namespace aarch64_acle {
+extern bool arm_neon_h_handled;
+void handle_arm_neon_h (bool);
+};
+
+#endif
diff --git a/gcc/config/aarch64/aarch64-protos.h
b/gcc/config/aarch64/aarch64-protos.h
index b794cd7de664..1610f54986e6 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1155,7 +1155,7 @@ tree aarch64_general_builtin_decl (unsigned, bool);
tree aarch64_general_builtin_rsqrt (unsigned int);
void aarch64_ms_variadic_abi_init_builtins (void);
void handle_arm_acle_h (void);
-void handle_arm_neon_h (void);
+void init_arm_neon_builtins (void);
bool aarch64_check_required_extensions (location_t, tree,
aarch64_required_extensions);
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index 5fb65dd8a319..b51115359eaf 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -277,6 +277,18 @@ parse_type (const function_instance &instance, const char
*&format)
if (ch == 's')
{
type_suffix_index suffix = parse_element_type (instance, format);
+
+ // HACK: remove once all NEON intrinsics have been ported to the
+ // pragma-based framework.
+ if (suffix == TYPE_SUFFIX_p8)
+ return aarch64_simd_types_trees[Poly8_t].eltype;
+ if (suffix == TYPE_SUFFIX_p16)
+ return aarch64_simd_types_trees[Poly16_t].eltype;
+ if (suffix == TYPE_SUFFIX_p64)
+ return aarch64_simd_types_trees[Poly64_t].eltype;
+ if (suffix == TYPE_SUFFIX_p128)
+ return aarch64_simd_types_trees[Poly128_t].eltype;
+
return scalar_types[type_suffixes[suffix].vector_type];
}
@@ -530,10 +542,10 @@ build_vs_offset (function_builder &b, const char
*signature,
predicate. FORCE_DIRECT_OVERLOADS is true if there is a one-to-one
mapping between "short" and "full" names, and if standard overload
resolution therefore isn't necessary. */
-static void
+void
build_all (function_builder &b, const char *signature,
const function_group_info &group, mode_suffix_index mode_suffix_id,
- bool force_direct_overloads = false)
+ bool force_direct_overloads)
{
for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
for (unsigned int gi = 0; group.groups[gi] != NUM_GROUP_SUFFIXES; ++gi)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index da96da69f273..6f5244ae81d2 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -55,6 +55,7 @@
#include "aarch64-sve-builtins-sme.h"
#include "aarch64-sve-builtins-shapes.h"
#include "aarch64-builtins.h"
+#include "aarch64-neon-builtins.h"
using namespace aarch64;
@@ -180,814 +181,6 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
{ "", 0, 1 }
};
-/* Define a TYPES_<combination> macro for each combination of type
- suffixes that an ACLE function can have, where <combination> is the
- name used in DEF_SVE_FUNCTION entries.
-
- Use S (T) for single type suffix T and D (T1, T2) for a pair of type
- suffixes T1 and T2. Use commas to separate the suffixes.
-
- Although the order shouldn't matter, the convention is to sort the
- suffixes lexicographically after dividing suffixes into a type
- class ("b", "f", etc.) and a numerical bit count. */
-
-/* _b8 _b16 _b32 _b64. */
-#define TYPES_all_pred(S, D, T) \
- S (b8), S (b16), S (b32), S (b64)
-
-/* _c8 _c16 _c32 _c64. */
-#define TYPES_all_count(S, D, T) \
- S (c8), S (c16), S (c32), S (c64)
-
-/* _b8 _b16 _b32 _b64
- _c8 _c16 _c32 _c64. */
-#define TYPES_all_pred_count(S, D, T) \
- TYPES_all_pred (S, D, T), \
- TYPES_all_count (S, D, T)
-
-/* _f16 _f32 _f64. */
-#define TYPES_all_float(S, D, T) \
- S (f16), S (f32), S (f64)
-
-/* _s8 _s16 _s32 _s64. */
-#define TYPES_all_signed(S, D, T) \
- S (s8), S (s16), S (s32), S (s64)
-
-/* _f16 _f32 _f64
- _s8 _s16 _s32 _s64. */
-#define TYPES_all_float_and_signed(S, D, T) \
- TYPES_all_float (S, D, T), TYPES_all_signed (S, D, T)
-
-/* _u8 _u16 _u32 _u64. */
-#define TYPES_all_unsigned(S, D, T) \
- S (u8), S (u16), S (u32), S (u64)
-
-/* _s8 _s16 _s32 _s64
- _u8 _u16 _u32 _u64. */
-#define TYPES_all_integer(S, D, T) \
- TYPES_all_signed (S, D, T), TYPES_all_unsigned (S, D, T)
-
-/* _f16 _f32 _f64
- _s8 _s16 _s32 _s64
- _u8 _u16 _u32 _u64. */
-#define TYPES_all_arith(S, D, T) \
- TYPES_all_float (S, D, T), TYPES_all_integer (S, D, T)
-
-#define TYPES_all_data(S, D, T) \
- TYPES_b_data (S, D, T), \
- TYPES_h_data (S, D, T), \
- TYPES_s_data (S, D, T), \
- TYPES_d_data (S, D, T)
-
-/* _b only. */
-#define TYPES_b(S, D, T) \
- S (b)
-
-/* _c only. */
-#define TYPES_c(S, D, T) \
- S (c)
-
-/* _u8. */
-#define TYPES_b_unsigned(S, D, T) \
- S (u8)
-
-/* _s8
- _u8. */
-#define TYPES_b_integer(S, D, T) \
- S (s8), TYPES_b_unsigned (S, D, T)
-
-/* _mf8
- _s8
- _u8. */
-#define TYPES_b_data(S, D, T) \
- S (mf8), TYPES_b_integer (S, D, T)
-
-/* _s8 _s16
- _u8 _u16. */
-#define TYPES_bh_integer(S, D, T) \
- S (s8), S (s16), S (u8), S (u16)
-
-/* _u8 _u32. */
-#define TYPES_bs_unsigned(S, D, T) \
- S (u8), S (u32)
-
-/* _s8 _s16 _s32. */
-#define TYPES_bhs_signed(S, D, T) \
- S (s8), S (s16), S (s32)
-
-/* _u8 _u16 _u32. */
-#define TYPES_bhs_unsigned(S, D, T) \
- S (u8), S (u16), S (u32)
-
-/* _s8 _s16 _s32
- _u8 _u16 _u32. */
-#define TYPES_bhs_integer(S, D, T) \
- TYPES_bhs_signed (S, D, T), TYPES_bhs_unsigned (S, D, T)
-
-#define TYPES_bh_data(S, D, T) \
- TYPES_b_data (S, D, T), \
- TYPES_h_data (S, D, T)
-
-#define TYPES_bhs_data(S, D, T) \
- TYPES_b_data (S, D, T), \
- TYPES_h_data (S, D, T), \
- TYPES_s_data (S, D, T)
-
-/* _s16_s8 _s32_s16 _s64_s32
- _u16_u8 _u32_u16 _u64_u32. */
-#define TYPES_bhs_widen(S, D, T) \
- D (s16, s8), D (s32, s16), D (s64, s32), \
- D (u16, u8), D (u32, u16), D (u64, u32)
-
-/* _bf16. */
-#define TYPES_h_bfloat(S, D, T) \
- S (bf16)
-
-/* _f16. */
-#define TYPES_h_float(S, D, T) \
- S (f16)
-
-/* _s16
- _u16. */
-#define TYPES_h_integer(S, D, T) \
- S (s16), S (u16)
-
-/* _bf16
- _f16
- _s16
- _u16. */
-#define TYPES_h_data(S, D, T) \
- S (bf16), S (f16), TYPES_h_integer (S, D, T)
-
-/* _s16 _s32. */
-#define TYPES_hs_signed(S, D, T) \
- S (s16), S (s32)
-
-/* _s16 _s32
- _u16 _u32. */
-#define TYPES_hs_integer(S, D, T) \
- TYPES_hs_signed (S, D, T), S (u16), S (u32)
-
-/* _f16 _f32. */
-#define TYPES_hs_float(S, D, T) \
- S (f16), S (f32)
-
-#define TYPES_hs_data(S, D, T) \
- TYPES_h_data (S, D, T), \
- TYPES_s_data (S, D, T)
-
-/* _u16 _u64. */
-#define TYPES_hd_unsigned(S, D, T) \
- S (u16), S (u64)
-
-/* _s16 _s32 _s64. */
-#define TYPES_hsd_signed(S, D, T) \
- S (s16), S (s32), S (s64)
-
-/* _s16 _s32 _s64
- _u16 _u32 _u64. */
-#define TYPES_hsd_integer(S, D, T) \
- TYPES_hsd_signed (S, D, T), S (u16), S (u32), S (u64)
-
-#define TYPES_hsd_data(S, D, T) \
- TYPES_h_data (S, D, T), \
- TYPES_s_data (S, D, T), \
- TYPES_d_data (S, D, T)
-
-/* _f16_mf8. */
-#define TYPES_h_float_mf8(S, D, T) \
- D (f16, mf8)
-
-/* _f32. */
-#define TYPES_s_float(S, D, T) \
- S (f32)
-
-/* _f32_mf8. */
-#define TYPES_s_float_mf8(S, D, T) \
- D (f32, mf8)
-
-/* _f32
- _s16 _s32 _s64
- _u16 _u32 _u64. */
-#define TYPES_s_float_hsd_integer(S, D, T) \
- TYPES_s_float (S, D, T), TYPES_hsd_integer (S, D, T)
-
-/* _f32
- _s32 _s64
- _u32 _u64. */
-#define TYPES_s_float_sd_integer(S, D, T) \
- TYPES_s_float (S, D, T), TYPES_sd_integer (S, D, T)
-
-/* _s32. */
-#define TYPES_s_signed(S, D, T) \
- S (s32)
-
-/* _u32. */
-#define TYPES_s_unsigned(S, D, T) \
- S (u32)
-
-/* _s32
- _u32. */
-#define TYPES_s_integer(S, D, T) \
- TYPES_s_signed (S, D, T), TYPES_s_unsigned (S, D, T)
-
-/* _f32
- _s32
- _u32. */
-#define TYPES_s_data(S, D, T) \
- TYPES_s_float (S, D, T), TYPES_s_integer (S, D, T)
-
-/* _s32 _s64. */
-#define TYPES_sd_signed(S, D, T) \
- S (s32), S (s64)
-
-/* _u32 _u64. */
-#define TYPES_sd_unsigned(S, D, T) \
- S (u32), S (u64)
-
-/* _s32 _s64
- _u32 _u64. */
-#define TYPES_sd_integer(S, D, T) \
- TYPES_sd_signed (S, D, T), TYPES_sd_unsigned (S, D, T)
-
-#define TYPES_sd_data(S, D, T) \
- TYPES_s_data (S, D, T), \
- TYPES_d_data (S, D, T)
-
-/* _f16 _f32 _f64
- _s32 _s64
- _u32 _u64. */
-#define TYPES_all_float_and_sd_integer(S, D, T) \
- TYPES_all_float (S, D, T), TYPES_sd_integer (S, D, T)
-
-/* _f64. */
-#define TYPES_d_float(S, D, T) \
- S (f64)
-
-/* _u64. */
-#define TYPES_d_unsigned(S, D, T) \
- S (u64)
-
-/* _s64
- _u64. */
-#define TYPES_d_integer(S, D, T) \
- S (s64), TYPES_d_unsigned (S, D, T)
-
-/* _f64
- _s64
- _u64. */
-#define TYPES_d_data(S, D, T) \
- TYPES_d_float (S, D, T), TYPES_d_integer (S, D, T)
-
-/* All the type combinations allowed by svcvt. */
-#define TYPES_cvt(S, D, T) \
- D (f16, f32), D (f16, f64), \
- D (f16, s16), D (f16, s32), D (f16, s64), \
- D (f16, u16), D (f16, u32), D (f16, u64), \
- \
- D (f32, f16), D (f32, f64), \
- D (f32, s32), D (f32, s64), \
- D (f32, u32), D (f32, u64), \
- \
- D (f64, f16), D (f64, f32), \
- D (f64, s32), D (f64, s64), \
- D (f64, u32), D (f64, u64), \
- \
- D (s16, f16), \
- D (s32, f16), D (s32, f32), D (s32, f64), \
- D (s64, f16), D (s64, f32), D (s64, f64), \
- \
- D (u16, f16), \
- D (u32, f16), D (u32, f32), D (u32, f64), \
- D (u64, f16), D (u64, f32), D (u64, f64)
-
-/* _bf16_f32. */
-#define TYPES_cvt_bfloat(S, D, T) \
- D (bf16, f32)
-
-/* { _bf16 _f16 } x _f32. */
-#define TYPES_cvt_h_s_float(S, D, T) \
- D (bf16, f32), D (f16, f32)
-
-/* _f32_f16. */
-#define TYPES_cvt_f32_f16(S, D, T) \
- D (f32, f16)
-
-/* _f32_f16
- _f64_f32. */
-#define TYPES_cvt_long(S, D, T) \
- D (f32, f16), D (f64, f32)
-
-/* _f32_f64. */
-#define TYPES_cvt_narrow_s(S, D, T) \
- D (f32, f64)
-
-/* _f16_f32
- _f32_f64. */
-#define TYPES_cvt_narrow(S, D, T) \
- D (f16, f32), TYPES_cvt_narrow_s (S, D, T)
-
-/* { _s32 _u32 } x _f32
-
- _f32 x { _s32 _u32 }. */
-#define TYPES_cvt_s_s(S, D, T) \
- D (s32, f32), \
- D (u32, f32), \
- D (f32, s32), \
- D (f32, u32)
-
-/* _f16_mf8
- _bf16_mf8. */
-#define TYPES_cvt_mf8(S, D, T) \
- D (f16, mf8), D (bf16, mf8)
-
-/* _mf8_f16
- _mf8_bf16. */
-#define TYPES_cvtn_mf8(S, D, T) \
- D (mf8, f16), D (mf8, bf16)
-
-/* _mf8_f32. */
-#define TYPES_cvtnx_mf8(S, D, T) \
- D (mf8, f32)
-
-/* { _s32 _s64 } x { _b8 _b16 _b32 _b64 }
- { _u32 _u64 }. */
-#define TYPES_inc_dec_n1(D, A) \
- D (A, b8), D (A, b16), D (A, b32), D (A, b64)
-#define TYPES_inc_dec_n(S, D, T) \
- TYPES_inc_dec_n1 (D, s32), \
- TYPES_inc_dec_n1 (D, s64), \
- TYPES_inc_dec_n1 (D, u32), \
- TYPES_inc_dec_n1 (D, u64)
-
-/* { _s16 _u16 } x _s32
-
- { _u16 } x _u32. */
-#define TYPES_qcvt_x2(S, D, T) \
- D (s16, s32), \
- D (u16, u32), \
- D (u16, s32)
-
-/* { _s8 _u8 } x _s32
-
- { _u8 } x _u32
-
- { _s16 _u16 } x _s64
-
- { _u16 } x _u64. */
-#define TYPES_qcvt_x4(S, D, T) \
- D (s8, s32), \
- D (u8, u32), \
- D (u8, s32), \
- D (s16, s64), \
- D (u16, u64), \
- D (u16, s64)
-
-/* _s16_s32
- _u16_u32. */
-#define TYPES_qrshr_x2(S, D, T) \
- D (s16, s32), \
- D (u16, u32)
-
-/* _u16_s32. */
-#define TYPES_qrshru_x2(S, D, T) \
- D (u16, s32)
-
-/* _s8_s32
- _s16_s64
- _u8_u32
- _u16_u64. */
-#define TYPES_qrshr_x4(S, D, T) \
- D (s8, s32), \
- D (s16, s64), \
- D (u8, u32), \
- D (u16, u64)
-
-/* _u8_s32
- _u16_s64. */
-#define TYPES_qrshru_x4(S, D, T) \
- D (u8, s32), \
- D (u16, s64)
-
-/* { _mf8 _bf16 } { _mf8 _bf16 }
- { _f16 _f32 _f64 } { _f16 _f32 _f64 }
- { _s8 _s16 _s32 _s64 } x { _s8 _s16 _s32 _s64 }
- { _u8 _u16 _u32 _u64 } { _u8 _u16 _u32 _u64 }. */
-#define TYPES_reinterpret1(D, A) \
- D (A, mf8), \
- D (A, bf16), \
- D (A, f16), D (A, f32), D (A, f64), \
- D (A, s8), D (A, s16), D (A, s32), D (A, s64), \
- D (A, u8), D (A, u16), D (A, u32), D (A, u64)
-#define TYPES_reinterpret(S, D, T) \
- TYPES_reinterpret1 (D, mf8), \
- TYPES_reinterpret1 (D, bf16), \
- TYPES_reinterpret1 (D, f16), \
- TYPES_reinterpret1 (D, f32), \
- TYPES_reinterpret1 (D, f64), \
- TYPES_reinterpret1 (D, s8), \
- TYPES_reinterpret1 (D, s16), \
- TYPES_reinterpret1 (D, s32), \
- TYPES_reinterpret1 (D, s64), \
- TYPES_reinterpret1 (D, u8), \
- TYPES_reinterpret1 (D, u16), \
- TYPES_reinterpret1 (D, u32), \
- TYPES_reinterpret1 (D, u64)
-
-/* _b_c
- _c_b. */
-#define TYPES_reinterpret_b(S, D, T) \
- D (b, c), \
- D (c, b)
-
-/* { _b8 _b16 _b32 _b64 } x { _s32 _s64 }
- { _u32 _u64 } */
-#define TYPES_while1(D, bn) \
- D (bn, s32), D (bn, s64), D (bn, u32), D (bn, u64)
-#define TYPES_while(S, D, T) \
- TYPES_while1 (D, b8), \
- TYPES_while1 (D, b16), \
- TYPES_while1 (D, b32), \
- TYPES_while1 (D, b64)
-
-/* { _b8 _b16 _b32 _b64 } x { _s64 }
- { _u64 } */
-#define TYPES_while_x(S, D, T) \
- D (b8, s64), D (b8, u64), \
- D (b16, s64), D (b16, u64), \
- D (b32, s64), D (b32, u64), \
- D (b64, s64), D (b64, u64)
-
-/* { _c8 _c16 _c32 _c64 } x { _s64 }
- { _u64 } */
-#define TYPES_while_x_c(S, D, T) \
- D (c8, s64), D (c8, u64), \
- D (c16, s64), D (c16, u64), \
- D (c32, s64), D (c32, u64), \
- D (c64, s64), D (c64, u64)
-
-/* _f32_f16
- _s32_s16
- _u32_u16. */
-#define TYPES_s_narrow_fsu(S, D, T) \
- D (f32, f16), D (s32, s16), D (u32, u16)
-
-/* _za8 _za16 _za32 _za64 _za128. */
-#define TYPES_all_za(S, D, T) \
- S (za8), S (za16), S (za32), S (za64), S (za128)
-
-/* _za64. */
-#define TYPES_d_za(S, D, T) \
- S (za64)
-
-/* { _za8 } x { _mf8 _s8 _u8 }
-
- { _za16 } x { _bf16 _f16 _s16 _u16 }
-
- { _za32 } x { _f32 _s32 _u32 }
-
- { _za64 } x { _f64 _s64 _u64 }. */
-#define TYPES_za_bhsd_data(S, D, T) \
- D (za8, mf8), D (za8, s8), D (za8, u8), \
- D (za16, bf16), D (za16, f16), D (za16, s16), D (za16, u16), \
- D (za32, f32), D (za32, s32), D (za32, u32), \
- D (za64, f64), D (za64, s64), D (za64, u64)
-
-/* Likewise, plus:
-
- { _za128 } x { _bf16 }
- { _f16 _f32 _f64 }
- { _s8 _s16 _s32 _s64 }
- { _u8 _u16 _u32 _u64 }. */
-
-#define TYPES_za_all_data(S, D, T) \
- TYPES_za_bhsd_data (S, D, T), \
- TYPES_reinterpret1 (D, za128)
-
-/* _za16_mf8. */
-#define TYPES_za_h_mf8(S, D, T) \
- D (za16, mf8)
-
-/* { _za_16 _za_32 } x _mf8. */
-#define TYPES_za_hs_mf8(S, D, T) \
- D (za16, mf8), D (za32, mf8)
-
-/* _za16_bf16. */
-#define TYPES_za_h_bfloat(S, D, T) \
- D (za16, bf16)
-
-/* _za16_f16. */
-#define TYPES_za_h_float(S, D, T) \
- D (za16, f16)
-
-/* _za32_s8. */
-#define TYPES_za_s_b_signed(S, D, T) \
- D (za32, s8)
-
-/* _za32_u8. */
-#define TYPES_za_s_b_unsigned(S, D, T) \
- D (za32, u8)
-
-/* _za32 x { _s8 _u8 }. */
-#define TYPES_za_s_b_integer(S, D, T) \
- D (za32, s8), D (za32, u8)
-
-/* _za32 x { _s16 _u16 }. */
-#define TYPES_za_s_h_integer(S, D, T) \
- D (za32, s16), D (za32, u16)
-
-/* _za32 x { _bf16 _f16 _s16 _u16 }. */
-#define TYPES_za_s_h_data(S, D, T) \
- D (za32, bf16), D (za32, f16), D (za32, s16), D (za32, u16)
-
-/* _za32_u32. */
-#define TYPES_za_s_unsigned(S, D, T) \
- D (za32, u32)
-
-/* _za32 x { _s32 _u32 }. */
-#define TYPES_za_s_integer(S, D, T) \
- D (za32, s32), D (za32, u32)
-
-/* _za32_mf8. */
-#define TYPES_za_s_mf8(S, D, T) \
- D (za32, mf8)
-
-/* _za32_f32. */
-#define TYPES_za_s_float(S, D, T) \
- D (za32, f32)
-
-/* _za32 x { _f32 _s32 _u32 }. */
-#define TYPES_za_s_data(S, D, T) \
- D (za32, f32), D (za32, s32), D (za32, u32)
-
-/* _za64 x { _s16 _u16 }. */
-#define TYPES_za_d_h_integer(S, D, T) \
- D (za64, s16), D (za64, u16)
-
-/* _za64_f64. */
-#define TYPES_za_d_float(S, D, T) \
- D (za64, f64)
-
-/* _za64 x { _s64 _u64 }. */
-#define TYPES_za_d_integer(S, D, T) \
- D (za64, s64), D (za64, u64)
-
-/* _za32 x { _s8 _u8 _bf16 _f16 _f32 }. */
-#define TYPES_mop_base(S, D, T) \
- D (za32, s8), D (za32, u8), D (za32, bf16), D (za32, f16), D (za32, f32)
-
-/* _za32_s8. */
-#define TYPES_mop_base_signed(S, D, T) \
- D (za32, s8)
-
-/* _za32_u8. */
-#define TYPES_mop_base_unsigned(S, D, T) \
- D (za32, u8)
-
-/* _za64 x { _s16 _u16 }. */
-#define TYPES_mop_i16i64(S, D, T) \
- D (za64, s16), D (za64, u16)
-
-/* _za64_s16. */
-#define TYPES_mop_i16i64_signed(S, D, T) \
- D (za64, s16)
-
-/* _za64_u16. */
-#define TYPES_mop_i16i64_unsigned(S, D, T) \
- D (za64, u16)
-
-/* _za. */
-#define TYPES_za(S, D, T) \
- S (za)
-
-/* Describe a tuple of type suffixes in which only the first is used. */
-#define DEF_VECTOR_TYPE(X) \
- { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }
-
-/* Describe a tuple of type suffixes in which only the first two are used. */
-#define DEF_DOUBLE_TYPE(X, Y) \
- { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y, NUM_TYPE_SUFFIXES }
-
-/* Describe a tuple of type suffixes in which three elements are used. */
-#define DEF_TRIPLE_TYPE(X, Y, Z) \
- { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y, TYPE_SUFFIX_ ## Z }
-
-/* Create an array that can be used in aarch64-sve-builtins.def to
- select the type suffixes in TYPES_<NAME>. */
-#define DEF_SVE_TYPES_ARRAY(NAME) \
- static const type_suffix_triple types_##NAME[] = { \
- TYPES_##NAME (DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE, DEF_TRIPLE_TYPE), \
- { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } \
- }
-
-/* For functions that don't take any type suffixes. */
-static const type_suffix_triple types_none[] = {
- { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES },
- { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }
-};
-
-/* Create an array for each TYPES_<combination> macro above. */
-DEF_SVE_TYPES_ARRAY (all_pred);
-DEF_SVE_TYPES_ARRAY (all_count);
-DEF_SVE_TYPES_ARRAY (all_pred_count);
-DEF_SVE_TYPES_ARRAY (all_float);
-DEF_SVE_TYPES_ARRAY (all_signed);
-DEF_SVE_TYPES_ARRAY (all_float_and_signed);
-DEF_SVE_TYPES_ARRAY (all_unsigned);
-DEF_SVE_TYPES_ARRAY (all_integer);
-DEF_SVE_TYPES_ARRAY (all_arith);
-DEF_SVE_TYPES_ARRAY (all_data);
-DEF_SVE_TYPES_ARRAY (b);
-DEF_SVE_TYPES_ARRAY (b_unsigned);
-DEF_SVE_TYPES_ARRAY (b_integer);
-DEF_SVE_TYPES_ARRAY (bh_integer);
-DEF_SVE_TYPES_ARRAY (bs_unsigned);
-DEF_SVE_TYPES_ARRAY (bhs_signed);
-DEF_SVE_TYPES_ARRAY (bhs_unsigned);
-DEF_SVE_TYPES_ARRAY (bhs_integer);
-DEF_SVE_TYPES_ARRAY (bh_data);
-DEF_SVE_TYPES_ARRAY (bhs_data);
-DEF_SVE_TYPES_ARRAY (bhs_widen);
-DEF_SVE_TYPES_ARRAY (c);
-DEF_SVE_TYPES_ARRAY (h_bfloat);
-DEF_SVE_TYPES_ARRAY (h_float);
-DEF_SVE_TYPES_ARRAY (h_float_mf8);
-DEF_SVE_TYPES_ARRAY (h_integer);
-DEF_SVE_TYPES_ARRAY (h_data);
-DEF_SVE_TYPES_ARRAY (hs_signed);
-DEF_SVE_TYPES_ARRAY (hs_integer);
-DEF_SVE_TYPES_ARRAY (hs_float);
-DEF_SVE_TYPES_ARRAY (hs_data);
-DEF_SVE_TYPES_ARRAY (hd_unsigned);
-DEF_SVE_TYPES_ARRAY (hsd_signed);
-DEF_SVE_TYPES_ARRAY (hsd_integer);
-DEF_SVE_TYPES_ARRAY (hsd_data);
-DEF_SVE_TYPES_ARRAY (s_float);
-DEF_SVE_TYPES_ARRAY (s_float_hsd_integer);
-DEF_SVE_TYPES_ARRAY (s_float_mf8);
-DEF_SVE_TYPES_ARRAY (s_float_sd_integer);
-DEF_SVE_TYPES_ARRAY (s_signed);
-DEF_SVE_TYPES_ARRAY (s_unsigned);
-DEF_SVE_TYPES_ARRAY (s_integer);
-DEF_SVE_TYPES_ARRAY (s_data);
-DEF_SVE_TYPES_ARRAY (sd_signed);
-DEF_SVE_TYPES_ARRAY (sd_unsigned);
-DEF_SVE_TYPES_ARRAY (sd_integer);
-DEF_SVE_TYPES_ARRAY (sd_data);
-DEF_SVE_TYPES_ARRAY (all_float_and_sd_integer);
-DEF_SVE_TYPES_ARRAY (d_float);
-DEF_SVE_TYPES_ARRAY (d_unsigned);
-DEF_SVE_TYPES_ARRAY (d_integer);
-DEF_SVE_TYPES_ARRAY (d_data);
-DEF_SVE_TYPES_ARRAY (cvt);
-DEF_SVE_TYPES_ARRAY (cvt_bfloat);
-DEF_SVE_TYPES_ARRAY (cvt_h_s_float);
-DEF_SVE_TYPES_ARRAY (cvt_f32_f16);
-DEF_SVE_TYPES_ARRAY (cvt_long);
-DEF_SVE_TYPES_ARRAY (cvt_mf8);
-DEF_SVE_TYPES_ARRAY (cvt_narrow_s);
-DEF_SVE_TYPES_ARRAY (cvt_narrow);
-DEF_SVE_TYPES_ARRAY (cvt_s_s);
-DEF_SVE_TYPES_ARRAY (cvtn_mf8);
-DEF_SVE_TYPES_ARRAY (cvtnx_mf8);
-DEF_SVE_TYPES_ARRAY (inc_dec_n);
-DEF_SVE_TYPES_ARRAY (qcvt_x2);
-DEF_SVE_TYPES_ARRAY (qcvt_x4);
-DEF_SVE_TYPES_ARRAY (qrshr_x2);
-DEF_SVE_TYPES_ARRAY (qrshr_x4);
-DEF_SVE_TYPES_ARRAY (qrshru_x2);
-DEF_SVE_TYPES_ARRAY (qrshru_x4);
-DEF_SVE_TYPES_ARRAY (reinterpret);
-DEF_SVE_TYPES_ARRAY (reinterpret_b);
-DEF_SVE_TYPES_ARRAY (while);
-DEF_SVE_TYPES_ARRAY (while_x);
-DEF_SVE_TYPES_ARRAY (while_x_c);
-DEF_SVE_TYPES_ARRAY (s_narrow_fsu);
-DEF_SVE_TYPES_ARRAY (all_za);
-DEF_SVE_TYPES_ARRAY (d_za);
-DEF_SVE_TYPES_ARRAY (za_bhsd_data);
-DEF_SVE_TYPES_ARRAY (za_all_data);
-DEF_SVE_TYPES_ARRAY (za_h_mf8);
-DEF_SVE_TYPES_ARRAY (za_h_bfloat);
-DEF_SVE_TYPES_ARRAY (za_h_float);
-DEF_SVE_TYPES_ARRAY (za_s_b_signed);
-DEF_SVE_TYPES_ARRAY (za_s_b_unsigned);
-DEF_SVE_TYPES_ARRAY (za_s_b_integer);
-DEF_SVE_TYPES_ARRAY (za_s_h_integer);
-DEF_SVE_TYPES_ARRAY (za_s_h_data);
-DEF_SVE_TYPES_ARRAY (za_s_unsigned);
-DEF_SVE_TYPES_ARRAY (za_s_integer);
-DEF_SVE_TYPES_ARRAY (za_s_mf8);
-DEF_SVE_TYPES_ARRAY (za_hs_mf8);
-DEF_SVE_TYPES_ARRAY (za_s_float);
-DEF_SVE_TYPES_ARRAY (za_s_data);
-DEF_SVE_TYPES_ARRAY (za_d_h_integer);
-DEF_SVE_TYPES_ARRAY (za_d_float);
-DEF_SVE_TYPES_ARRAY (za_d_integer);
-DEF_SVE_TYPES_ARRAY (mop_base);
-DEF_SVE_TYPES_ARRAY (mop_base_signed);
-DEF_SVE_TYPES_ARRAY (mop_base_unsigned);
-DEF_SVE_TYPES_ARRAY (mop_i16i64);
-DEF_SVE_TYPES_ARRAY (mop_i16i64_signed);
-DEF_SVE_TYPES_ARRAY (mop_i16i64_unsigned);
-DEF_SVE_TYPES_ARRAY (za);
-
-static const group_suffix_index groups_none[] = {
- GROUP_none, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_x2[] = { GROUP_x2, NUM_GROUP_SUFFIXES };
-
-static const group_suffix_index groups_x12[] = {
- GROUP_none, GROUP_x2, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_x4[] = { GROUP_x4, NUM_GROUP_SUFFIXES };
-
-static const group_suffix_index groups_x24[] = {
- GROUP_x2, GROUP_x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_x124[] = {
- GROUP_none, GROUP_x2, GROUP_x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_x1234[] = {
- GROUP_none, GROUP_x2, GROUP_x3, GROUP_x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg1x2[] = {
- GROUP_vg1x2, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg1x4[] = {
- GROUP_vg1x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg1x24[] = {
- GROUP_vg1x2, GROUP_vg1x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg2[] = {
- GROUP_vg2x1, GROUP_vg2x2, GROUP_vg2x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg4[] = {
- GROUP_vg4x1, GROUP_vg4x2, GROUP_vg4x4, NUM_GROUP_SUFFIXES
-};
-
-static const group_suffix_index groups_vg24[] = {
- GROUP_vg2, GROUP_vg4, NUM_GROUP_SUFFIXES
-};
-
-/* Used by functions that have no governing predicate. */
-static const predication_index preds_none[] = { PRED_none, NUM_PREDS };
-
-/* Used by functions that have a governing predicate but do not have an
- explicit suffix. */
-static const predication_index preds_implicit[] = { PRED_implicit, NUM_PREDS };
-
-/* Used by functions that only support "_m" predication. */
-static const predication_index preds_m[] = { PRED_m, NUM_PREDS };
-
-/* Used by functions that allow merging and "don't care" predication,
- but are not suitable for predicated MOVPRFX. */
-static const predication_index preds_mx[] = {
- PRED_m, PRED_x, NUM_PREDS
-};
-
-/* Used by functions that allow merging, zeroing and "don't care"
- predication. */
-static const predication_index preds_mxz[] = {
- PRED_m, PRED_x, PRED_z, NUM_PREDS
-};
-
-/* Used by functions that have the mxz predicated forms above, and in addition
- have an unpredicated form. */
-static const predication_index preds_mxz_or_none[] = {
- PRED_m, PRED_x, PRED_z, PRED_none, NUM_PREDS
-};
-
-/* Used by functions that allow merging and zeroing predication but have
- no "_x" form. */
-static const predication_index preds_mz[] = { PRED_m, PRED_z, NUM_PREDS };
-
-/* Used by functions that have an unpredicated form and a _z predicated
- form. */
-static const predication_index preds_z_or_none[] = {
- PRED_z, PRED_none, NUM_PREDS
-};
-
-/* Used by (mostly predicate) functions that only support "_z" predication. */
-static const predication_index preds_z[] = { PRED_z, NUM_PREDS };
-
-/* Used by SME instructions that always merge into ZA. */
-static const predication_index preds_za_m[] = { PRED_za_m, NUM_PREDS };
-
-#define NONSTREAMING_SVE(X) nonstreaming_only (AARCH64_FL_SVE | (X))
-#define SVE_AND_SME(X, Y) streaming_compatible (AARCH64_FL_SVE | (X), (Y))
-#define SSVE(X) SVE_AND_SME (X, X)
-
/* A list of all arm_sve.h functions. */
static CONSTEXPR const function_group_info function_groups[] = {
#define DEF_SVE_FUNCTION_GS_FPM(NAME, SHAPE, TYPES, GROUPS, PREDS, FPM_MODE) \
@@ -1346,6 +539,9 @@ function_builder::function_builder (handle_pragma_index
pragma_index,
m_function_nulls = function_nulls;
gcc_obstack_init (&m_string_obstack);
+
+ if (!function_table)
+ function_table = hash_table<registered_function_hasher>::create_ggc (1023);
}
function_builder::~function_builder ()
@@ -3859,9 +3055,9 @@ gimple_folder::fold_to_stmt_vops (gimple *g)
gimple *
gimple_folder::fold ()
{
- /* Don't fold anything when SVE is disabled; emit an error during
+ /* Don't fold anything when NEON/SVE are disabled; emit an error during
expansion instead. */
- if (!TARGET_SVE)
+ if (!TARGET_SIMD && !TARGET_SVE)
return NULL;
/* Punt if the function has a return type and no result location is
@@ -4772,6 +3968,7 @@ init_builtins ()
register_builtin_types ();
if (in_lto_p)
{
+ aarch64_acle::handle_arm_neon_h (false);
handle_arm_sve_h (false);
handle_arm_sme_h (false);
handle_arm_neon_sve_bridge_h (false);
@@ -4874,13 +4071,27 @@ register_svprfop ()
"svprfop", &values);
}
+static bool arm_sve_h_handled = false;
+static location_t arm_sve_h_location;
+
+static bool arm_sme_h_handled = false;
+static location_t arm_sme_h_location;
+
+static bool arm_neon_sve_bridge_h_handled = false;
+static location_t arm_neon_sve_bridge_h_location;
+
/* Implement #pragma GCC aarch64 "arm_sve.h". */
void
handle_arm_sve_h (bool function_nulls_p)
{
- if (function_table)
+ if (!aarch64_acle::arm_neon_h_handled)
+ aarch64_acle::handle_arm_neon_h (false);
+
+ if (arm_sve_h_handled)
{
error ("duplicate definition of %qs", "arm_sve.h");
+ inform (arm_sve_h_location, "previous definition of %qs here",
+ "arm_sve.h");
return;
}
@@ -4903,25 +4114,38 @@ handle_arm_sve_h (bool function_nulls_p)
register_svprfop ();
/* Define the functions. */
- function_table = hash_table<registered_function_hasher>::create_ggc (1023);
function_builder builder (arm_sve_handle, function_nulls_p);
for (unsigned int i = 0; i < ARRAY_SIZE (function_groups); ++i)
builder.register_function_group (function_groups[i]);
+
+ arm_sve_h_handled = true;
+ arm_sve_h_location = input_location;
}
/* Implement #pragma GCC aarch64 "arm_neon_sve_bridge.h". */
void
handle_arm_neon_sve_bridge_h (bool function_nulls_p)
{
- if (initial_indexes[arm_sme_handle] == 0)
+ if (!arm_sme_h_handled)
handle_arm_sme_h (true);
+ if (arm_neon_sve_bridge_h_handled)
+ {
+ error ("duplicate definition of %qs", "arm_neon_sve_bridge.h");
+ inform (arm_neon_sve_bridge_h_location, "previous definition of %qs
here",
+ "arm_neon_sve_bridge.h");
+ return;
+ }
+
aarch64_target_switcher switcher;
/* Define the functions. */
function_builder builder (arm_neon_sve_handle, function_nulls_p);
for (unsigned int i = 0; i < ARRAY_SIZE (neon_sve_function_groups); ++i)
builder.register_function_group (neon_sve_function_groups[i]);
+
+ arm_neon_sve_bridge_h_handled = true;
+ arm_neon_sve_bridge_h_location = input_location;
}
/* Return the function decl with SVE function subcode CODE, or error_mark_node
@@ -4938,7 +4162,7 @@ builtin_decl (unsigned int code, bool)
void
handle_arm_sme_h (bool function_nulls_p)
{
- if (!function_table)
+ if (!arm_sve_h_handled)
{
error ("%qs defined without first defining %qs",
"arm_sme.h", "arm_sve.h");
@@ -4950,6 +4174,9 @@ handle_arm_sme_h (bool function_nulls_p)
function_builder builder (arm_sme_handle, function_nulls_p);
for (unsigned int i = 0; i < ARRAY_SIZE (sme_function_groups); ++i)
builder.register_function_group (sme_function_groups[i]);
+
+ arm_sme_h_handled = true;
+ arm_sme_h_location = input_location;
}
/* If we're implementing manual overloading, check whether the SVE
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def
b/gcc/config/aarch64/aarch64-sve-builtins.def
index 6ad257643b69..6df8d41d7f63 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins.def
@@ -26,6 +26,8 @@
- aarch64-sve-builtins.def for common data types, groups, and other
supporting definitions used across all files.
+ - aarch64-neon-builtins-base.def for any AdvSIMD intrinsic that is
+ enabled by the +simd extension.
- aarch64-sve-builtins-base.def for the baseline SVE intrinsics which
predate
SVE2 and SME.
- aarch64-sve-builtins-sve2.def for any scalable SIMD intrinsic that is
@@ -159,6 +161,15 @@ DEF_SVE_NEON_TYPE_SUFFIX (u32, svuint32_t, unsigned, 32,
VNx4SImode,
DEF_SVE_NEON_TYPE_SUFFIX (u64, svuint64_t, unsigned, 64, VNx2DImode,
Uint64x1_t, Uint64x2_t)
+DEF_SVE_NEON_TYPE_SUFFIX (p8, svuint8_t, poly, 8, VNx16QImode,
+ Poly8x8_t, Poly8x16_t)
+DEF_SVE_NEON_TYPE_SUFFIX (p16, svuint16_t, poly, 16, VNx8HImode,
+ Poly16x4_t, Poly16x8_t)
+DEF_SVE_NEON_TYPE_SUFFIX (p64, svuint64_t, poly, 64, VNx2DImode,
+ Poly64x1_t, Poly64x2_t)
+DEF_SVE_NEON_TYPE_SUFFIX (p128, svuint64_t, poly, 128, TImode,
+ Poly128_t, Poly128_t)
+
/* Associate _za with bytes. This is needed for svldr_vnum_za and
svstr_vnum_za, whose ZA offset can be in the range [0, 15], as for za8. */
DEF_SME_ZA_SUFFIX (za, 8, VNx16QImode)
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index 82cf94b51739..b5acb0c9321e 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -193,147 +193,6 @@
__vec; \
})
-/* vadd */
-__extension__ extern __inline int8x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_s8 (int8x8_t __a, int8x8_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int16x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_s16 (int16x4_t __a, int16x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int32x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_s32 (int32x2_t __a, int32x2_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline float32x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_f32 (float32x2_t __a, float32x2_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline float64x1_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_f64 (float64x1_t __a, float64x1_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint8x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_u8 (uint8x8_t __a, uint8x8_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint16x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_u16 (uint16x4_t __a, uint16x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint32x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_u32 (uint32x2_t __a, uint32x2_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int64x1_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_s64 (int64x1_t __a, int64x1_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint64x1_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_u64 (uint64x1_t __a, uint64x1_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_s8 (int8x16_t __a, int8x16_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_s16 (int16x8_t __a, int16x8_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_s32 (int32x4_t __a, int32x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline int64x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_s64 (int64x2_t __a, int64x2_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_f32 (float32x4_t __a, float32x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline float64x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_f64 (float64x2_t __a, float64x2_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_u8 (uint8x16_t __a, uint8x16_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_u16 (uint16x8_t __a, uint16x8_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_u32 (uint32x4_t __a, uint32x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline uint64x2_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_u64 (uint64x2_t __a, uint64x2_t __b)
-{
- return __a + __b;
-}
-
__extension__ extern __inline int16x8_t
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
vaddl_s8 (int8x8_t __a, int8x8_t __b)
@@ -25904,20 +25763,6 @@ vsqrtq_f16 (float16x8_t __a)
/* ARMv8.2-A FP16 two operands vector intrinsics. */
-__extension__ extern __inline float16x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_f16 (float16x4_t __a, float16x4_t __b)
-{
- return __a + __b;
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_f16 (float16x8_t __a, float16x8_t __b)
-{
- return __a + __b;
-}
-
__extension__ extern __inline float16x4_t
__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
vabd_f16 (float16x4_t __a, float16x4_t __b)
@@ -28526,55 +28371,6 @@ vusmmlaq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t
__b)
#pragma GCC pop_options
-__extension__ extern __inline poly8x8_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_p8 (poly8x8_t __a, poly8x8_t __b)
-{
- return __a ^ __b;
-}
-
-__extension__ extern __inline poly16x4_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_p16 (poly16x4_t __a, poly16x4_t __b)
-{
- return __a ^ __b;
-}
-
-__extension__ extern __inline poly64x1_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vadd_p64 (poly64x1_t __a, poly64x1_t __b)
-{
- return __a ^ __b;
-}
-
-__extension__ extern __inline poly8x16_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_p8 (poly8x16_t __a, poly8x16_t __b)
-{
- return __a ^ __b;
-}
-
-__extension__ extern __inline poly16x8_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_p16 (poly16x8_t __a, poly16x8_t __b)
-{
- return __a ^__b;
-}
-
-__extension__ extern __inline poly64x2_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_p64 (poly64x2_t __a, poly64x2_t __b)
-{
- return __a ^ __b;
-}
-
-__extension__ extern __inline poly128_t
-__attribute ((__always_inline__, __gnu_inline__, __artificial__))
-vaddq_p128 (poly128_t __a, poly128_t __b)
-{
- return __a ^ __b;
-}
-
#undef __aarch64_vget_lane_any
#undef __aarch64_vdup_lane_any
diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index 1171d2023490..66b302192ada 100644
--- a/gcc/config/aarch64/t-aarch64
+++ b/gcc/config/aarch64/t-aarch64
@@ -67,6 +67,52 @@ aarch64-builtins.o:
$(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) \
$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
$(srcdir)/config/aarch64/aarch64-builtins.cc
+aarch64-neon-builtins.o: \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.cc \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.def \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.def \
+ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
+ $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) $(DIAGNOSTIC_H) \
+ $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \
+ gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) tree-vector-builder.h \
+ stor-layout.h alias.h gimple-fold.h langhooks.h \
+ stringpool.h \
+ $(srcdir)/config/aarch64/aarch64-acle-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-shapes.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.h
+ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.cc
+
+aarch64-neon-builtins-shapes.o: \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-shapes.cc \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.def \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.def \
+ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
+ $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) \
+ $(srcdir)/config/aarch64/aarch64-acle-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-shapes.h
+ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-shapes.cc
+
+aarch64-neon-builtins-base.o: \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.cc \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.def \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.def \
+ $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
+ $(TM_P_H) memmodel.h insn-codes.h $(OPTABS_H) $(RECOG_H) \
+ $(EXPR_H) $(BASIC_BLOCK_H) $(FUNCTION_H) fold-const.h $(GIMPLE_H) \
+ gimple-iterator.h gimplify.h explow.h $(EMIT_RTL_H) tree-vector-builder.h \
+ rtx-vector-builder.h vec-perm-indices.h \
+ $(srcdir)/config/aarch64/aarch64-acle-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-shapes.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.h \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-functions.h
+ $(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+ $(srcdir)/config/aarch64/aarch64-neon-builtins-base.cc
+
aarch64-sve-builtins.o: $(srcdir)/config/aarch64/aarch64-sve-builtins.cc \
$(srcdir)/config/aarch64/aarch64-sve-builtins.def \
$(srcdir)/config/aarch64/aarch64-sve-builtins-base.def \
diff --git a/gcc/testsuite/g++.target/aarch64/pr103147-6.C
b/gcc/testsuite/g++.target/aarch64/pr103147-6.C
index 15a606f976c8..bbea67b9b7db 100644
--- a/gcc/testsuite/g++.target/aarch64/pr103147-6.C
+++ b/gcc/testsuite/g++.target/aarch64/pr103147-6.C
@@ -1,3 +1,4 @@
/* { dg-options "-mgeneral-regs-only" } */
+/* { dg-excess-errors "arm_neon.h" } */
#include <arm_neon.h>
diff --git a/gcc/testsuite/g++.target/aarch64/pr117048.C
b/gcc/testsuite/g++.target/aarch64/pr117048.C
index ae46e5875e4c..a9775700c5bf 100644
--- a/gcc/testsuite/g++.target/aarch64/pr117048.C
+++ b/gcc/testsuite/g++.target/aarch64/pr117048.C
@@ -30,5 +30,5 @@ void G(
v[12] = vgetq_lane_s64(vd01, 0);
}
-/* { dg-final { scan-assembler {\txar\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d,
32\n} } } */
+/* { dg-final { scan-assembler {\txar\tv[0-9]+\.2d, v[0-9]+\.2d, v[0-9]+\.2d,
#?32\n} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/neon/aarch64-neon.exp
b/gcc/testsuite/gcc.target/aarch64/neon/aarch64-neon.exp
new file mode 100644
index 000000000000..03c4467e5354
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/neon/aarch64-neon.exp
@@ -0,0 +1,39 @@
+# Specific regression driver for AArch64 NEON.
+# Copyright (C) 2026-2026 Free Software Foundation, Inc.
+# Contributed by ARM Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# <http://www.gnu.org/licenses/>. */
+
+# GCC testsuite that uses the `dg.exp' driver.
+
+# Exit immediately if this isn't an AArch64 target.
+if {![istarget aarch64*-*-*] } then {
+ return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*\[cCs\]]] \
+ " -ansi -pedantic-errors -std=c23 -O3 -march=armv8-a+simd" ""
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/neon/arm_neon_test.h
b/gcc/testsuite/gcc.target/aarch64/neon/arm_neon_test.h
new file mode 100644
index 000000000000..7d9371e01047
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/neon/arm_neon_test.h
@@ -0,0 +1,22 @@
+#include "arm_neon.h"
+
+#pragma GCC target "+simd+fp16+bf16+sha3"
+
+#define TEST_UNARY(NAME, RET_TYPE, ARG_1_TYPE)
\
+ RET_TYPE test_##NAME (ARG_1_TYPE a) { return NAME (a); }
+
+#define TEST_UNIFORM_UNARY(NAME, TYPE) TEST_UNARY (NAME, TYPE, TYPE)
+
+#define TEST_BINARY(NAME, RET_TYPE, ARG_1_TYPE, ARG_2_TYPE)
\
+ RET_TYPE test_##NAME (ARG_1_TYPE a, ARG_2_TYPE b) { return NAME (a, b); }
+
+#define TEST_UNIFORM_BINARY(NAME, TYPE) TEST_BINARY (NAME, TYPE, TYPE, TYPE)
+
+#define TEST_TERNARY(NAME, RET_TYPE, ARG_1_TYPE, ARG_2_TYPE, ARG_3_TYPE)
\
+ RET_TYPE test_##NAME (ARG_1_TYPE a, ARG_2_TYPE b, ARG_3_TYPE c)
\
+ {
\
+ return NAME (a, b, c);
\
+ }
+
+#define TEST_UNIFORM_TERNARY(NAME, TYPE)
\
+ TEST_TERNARY (NAME, TYPE, TYPE, TYPE, TYPE)
diff --git a/gcc/testsuite/gcc.target/aarch64/neon/vadd.c
b/gcc/testsuite/gcc.target/aarch64/neon/vadd.c
new file mode 100644
index 000000000000..e622718685db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/neon/vadd.c
@@ -0,0 +1,203 @@
+/* { dg-do compile } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon_test.h"
+
+/*
+** test_vadd_u8:
+** add v0\.8b, (v0\.8b, v1\.8b|v1\.8b, v0\.8b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_u8, uint8x8_t)
+
+/*
+** test_vadd_s8:
+** add v0\.8b, (v0\.8b, v1\.8b|v1\.8b, v0\.8b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_s8, int8x8_t)
+
+/*
+** test_vadd_p8:
+** eor v0\.8b, (v0\.8b, v1\.8b|v1\.8b, v0\.8b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_p8, poly8x8_t)
+
+/*
+** test_vadd_u16:
+** add v0\.4h, (v0\.4h, v1\.4h|v1\.4h, v0\.4h)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_u16, uint16x4_t)
+
+/*
+** test_vadd_s16:
+** add v0\.4h, (v0\.4h, v1\.4h|v1\.4h, v0\.4h)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_s16, int16x4_t)
+
+/*
+** test_vadd_p16:
+** eor v0\.8b, (v0\.8b, v1\.8b|v1\.8b, v0\.8b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_p16, poly16x4_t)
+
+/*
+** test_vadd_u32:
+** add v0\.2s, (v0\.2s, v1\.2s|v1\.2s, v0\.2s)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_u32, uint32x2_t)
+
+/*
+** test_vadd_s32:
+** add v0\.2s, (v0\.2s, v1\.2s|v1\.2s, v0\.2s)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_s32, int32x2_t)
+
+/*
+** test_vadd_u64:
+** add d0, (d0, d1|d1, d0)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_u64, uint64x1_t)
+
+/*
+** test_vadd_s64:
+** add d0, (d0, d1|d1, d0)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_s64, int64x1_t)
+
+/*
+** test_vadd_p64:
+** eor v0\.8b, (v0\.8b, v1\.8b|v1\.8b, v0\.8b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vadd_p64, poly64x1_t)
+
+/*
+** test_vaddq_u8:
+** add v0\.16b, (v0\.16b, v1\.16b|v1\.16b, v0\.16b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_u8, uint8x16_t)
+
+/*
+** test_vaddq_s8:
+** add v0\.16b, (v0\.16b, v1\.16b|v1\.16b, v0\.16b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_s8, int8x16_t)
+
+/*
+** test_vaddq_p8:
+** eor v0\.16b, (v0\.16b, v1\.16b|v1\.16b, v0\.16b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_p8, poly8x16_t)
+
+/*
+** test_vaddq_u16:
+** add v0\.8h, (v0\.8h, v1\.8h|v1\.8h, v0\.8h)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_u16, uint16x8_t)
+
+/*
+** test_vaddq_s16:
+** add v0\.8h, (v0\.8h, v1\.8h|v1\.8h, v0\.8h)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_s16, int16x8_t)
+
+/*
+** test_vaddq_f16:
+** fadd v0\.8h, (v0\.8h, v1\.8h|v1\.8h, v0\.8h)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_f16, float16x8_t)
+
+/*
+** test_vaddq_p16:
+** eor v0\.16b, (v0\.16b, v1\.16b|v1\.16b, v0\.16b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_p16, poly16x8_t)
+
+/*
+** test_vaddq_u32:
+** add v0\.4s, (v0\.4s, v1\.4s|v1\.4s, v0\.4s)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_u32, uint32x4_t)
+
+/*
+** test_vaddq_s32:
+** add v0\.4s, (v0\.4s, v1\.4s|v1\.4s, v0\.4s)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_s32, int32x4_t)
+
+/*
+** test_vaddq_f32:
+** fadd v0\.4s, (v0\.4s, v1\.4s|v1\.4s, v0\.4s)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_f32, float32x4_t)
+
+/*
+** test_vaddq_u64:
+** add v0\.2d, (v0\.2d, v1\.2d|v1\.2d, v0\.2d)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_u64, uint64x2_t)
+
+/*
+** test_vaddq_s64:
+** add v0\.2d, (v0\.2d, v1\.2d|v1\.2d, v0\.2d)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_s64, int64x2_t)
+
+/*
+** test_vaddq_f64:
+** fadd v0\.2d, (v0\.2d, v1\.2d|v1\.2d, v0\.2d)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_f64, float64x2_t)
+
+/*
+** test_vaddq_p64:
+** eor v0\.16b, (v0\.16b, v1\.16b|v1\.16b, v0\.16b)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_p64, poly64x2_t)
+
+/* `poly128_t` is a scalar type, like `__uint128_t`, so it is passed in two GPR
+ registers. *
+/*
+** test_vaddq_p128:
+** eor x[0-9], x[0-9]+, x[0-9]+
+** eor x[0-9], x[0-9]+, x[0-9]+
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddq_p128, poly128_t)
+
+/*
+** test_vaddd_u64:
+** add x0, (x0, x1|x1, x0)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddd_u64, uint64_t)
+
+/*
+** test_vaddd_s64:
+** add x0, (x0, x1|x1, x0)
+** ret
+*/
+TEST_UNIFORM_BINARY (vaddd_s64, int64_t)
diff --git a/gcc/testsuite/gcc.target/aarch64/pr103147-6.c
b/gcc/testsuite/gcc.target/aarch64/pr103147-6.c
index 15a606f976c8..bbea67b9b7db 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr103147-6.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr103147-6.c
@@ -1,3 +1,4 @@
/* { dg-options "-mgeneral-regs-only" } */
+/* { dg-excess-errors "arm_neon.h" } */
#include <arm_neon.h>
--
2.54.0