[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #20 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:fd0ab7c734b04b91653467b94afd48ceca122083

commit r12-7356-gfd0ab7c734b04b91653467b94afd48ceca122083
Author: Christophe Lyon
Date: Wed Feb 23 06:44:12 2022 +

    arm: Fix typo in auto-vectorized MVE comparisons

    I made a last minute renaming of mve_const_bool_vec_to_hi () into
    mve_bool_vec_to_const () and forgot to update the call sites in
    vfp.md accordingly.

    Committed as obvious.

    2022-02-23 Christophe Lyon

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Fix typo.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon changed:

           What    |Removed                     |Added
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #19 from Christophe Lyon ---
Should be fixed, at last.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #18 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:c6b4ea7ab1aa6c5c07798fa6c6ad15dd1761b5ed

commit r12-7344-gc6b4ea7ab1aa6c5c07798fa6c6ad15dd1761b5ed
Author: Christophe Lyon
Date: Wed Oct 13 09:16:49 2021 +

    arm: Convert more MVE/CDE builtins to predicate qualifiers

    This patch covers a few non-load/store builtins where we do not use
    the iterator and thus we cannot use .

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/arm-builtins.cc (CX_UNARY_UNONE_QUALIFIERS): Use
    predicate.
    (CX_BINARY_UNONE_QUALIFIERS): Likewise.
    (CX_TERNARY_UNONE_QUALIFIERS): Likewise.
    (TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
    (QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
    (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Delete.
    * config/arm/arm_mve_builtins.def: Use predicated qualifiers.
    * config/arm/mve.md: Use VxBI instead of HI.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #17 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:6a7c13a0cf2290b60ab36f9ce1027b92838586bd

commit r12-7343-g6a7c13a0cf2290b60ab36f9ce1027b92838586bd
Author: Christophe Lyon
Date: Wed Oct 20 15:39:17 2021 +

    arm: Convert more load/store MVE builtins to predicate qualifiers

    This patch covers a few builtins where we do not use the iterator
    and thus we cannot use .

    For v2di instructions, we keep the HI mode for predicates.

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/arm-builtins.cc (STRSBS_P_QUALIFIERS): Use predicate
    qualifier.
    (STRSBU_P_QUALIFIERS): Likewise.
    (LDRGBS_Z_QUALIFIERS): Likewise.
    (LDRGBU_Z_QUALIFIERS): Likewise.
    (LDRGBWBXU_Z_QUALIFIERS): Likewise.
    (LDRGBWBS_Z_QUALIFIERS): Likewise.
    (LDRGBWBU_Z_QUALIFIERS): Likewise.
    (STRSBWBS_P_QUALIFIERS): Likewise.
    (STRSBWBU_P_QUALIFIERS): Likewise.
    * config/arm/mve.md: Use VxBI instead of HI.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #16 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:724d6566cd11c676f3bc082a9771784c825affb1

commit r12-7342-g724d6566cd11c676f3bc082a9771784c825affb1
Author: Christophe Lyon
Date: Wed Oct 13 09:16:40 2021 +

    arm: Convert more MVE builtins to predicate qualifiers

    This patch covers all builtins that have an HI operand and use the
    iterator, thus we can replace HI whe .

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/arm-builtins.cc
    (TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
    (TERNOP_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
    (TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
    (TERNOP_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
    (TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
    (TERNOP_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
    (TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Change to ...
    (TERNOP_NONE_NONE_UNONE_PRED_QUALIFIERS): ... this.
    (QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
    (QUADOP_UNONE_UNONE_NONE_NONE_PRED_QUALIFIERS): ... this.
    (QUADOP_NONE_NONE_NONE_NONE_PRED_QUALIFIERS): New.
    (QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
    (QUADOP_NONE_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
    (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
    (QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
    (QUADOP_UNONE_UNONE_NONE_IMM_PRED_QUALIFIERS): ... this.
    (QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
    (QUADOP_NONE_NONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
    (QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
    (QUADOP_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
    (QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
    (QUADOP_UNONE_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
    (STRS_P_QUALIFIERS): Use predicate qualifier.
    (STRU_P_QUALIFIERS): Likewise.
    (STRSU_P_QUALIFIERS): Likewise.
    (STRSS_P_QUALIFIERS): Likewise.
    (LDRGS_Z_QUALIFIERS): Likewise.
    (LDRGU_Z_QUALIFIERS): Likewise.
    (LDRS_Z_QUALIFIERS): Likewise.
    (LDRU_Z_QUALIFIERS): Likewise.
    (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
    (QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
    (BINOP_NONE_NONE_PRED_QUALIFIERS): New.
    (BINOP_UNONE_UNONE_PRED_QUALIFIERS): New.
    * config/arm/arm_mve_builtins.def: Use new predicated qualifiers.
    * config/arm/mve.md: Use MVE_VPRED instead of HI.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #15 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:e6a4aefce8e47a7d3ba781066a1410ebfa963e59

commit r12-7341-ge6a4aefce8e47a7d3ba781066a1410ebfa963e59
Author: Christophe Lyon
Date: Wed Oct 13 09:16:35 2021 +

    arm: Convert remaining MVE vcmp builtins to predicate qualifiers

    This is mostly a mechanical change, only tested by the intrinsics
    expansion tests.

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/arm-builtins.cc (BINOP_UNONE_NONE_NONE_QUALIFIERS):
    Delete.
    (TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
    (TERNOP_PRED_NONE_NONE_PRED_QUALIFIERS): ... this.
    (TERNOP_PRED_UNONE_UNONE_PRED_QUALIFIERS): New.
    * config/arm/arm_mve_builtins.def (vcmp*q_n_, vcmp*q_m_f): Use new
    predicated qualifiers.
    * config/arm/mve.md (mve_vcmpq_n_)
    (mve_vcmp*q_m_f): Use MVE_VPRED instead of HI.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #14 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:91224cf625dc90304bb515a0cc602beed48fe3da

commit r12-7339-g91224cf625dc90304bb515a0cc602beed48fe3da
Author: Christophe Lyon
Date: Wed Oct 13 09:16:27 2021 +

    arm: Implement auto-vectorized MVE comparisons with vectors of boolean predicates

    We make use of qualifier_predicate to describe MVE builtins
    prototypes, restricting to auto-vectorizable vcmp* and vpsel
    builtins, as they are exercised by the tests added earlier in the
    series.

    Special handling is needed for mve_vpselq because it has a v2di
    variant, which has no natural VPR.P0 representation: we keep HImode
    for it.

    The vector_compare expansion code is updated to use the right VxBI
    mode instead of HI for the result.

    We extend the existing thumb2_movhi_vfp and thumb2_movhi_fp16
    patterns to use the new MVE_7_HI iterator which covers HI and the
    new VxBI modes, in conjunction with the new DB constraint for a
    constant vector of booleans.

    This patch also adds tests derived from the one provided in PR
    target/101325: there is a compile-only test because I did not have
    access to anything that could execute MVE code until recently. I
    have been able to add an executable test since QEMU supports MVE.

    Instead of adding arm_v8_1m_mve_hw, I update arm_mve_hw so that it
    uses add_options_for_arm_v8_1m_mve_fp, like arm_neon_hw does. This
    ensures arm_mve_hw passes even if the toolchain does not generate
    MVE code by default.

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon
               Richard Sandiford

    gcc/
    PR target/100757
    PR target/101325
    * config/arm/arm-builtins.cc (BINOP_PRED_UNONE_UNONE_QUALIFIERS)
    (BINOP_PRED_NONE_NONE_QUALIFIERS)
    (TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS)
    (TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
    * config/arm/arm-protos.h (mve_bool_vec_to_const): New.
    * config/arm/arm.cc (arm_hard_regno_mode_ok): Handle new VxBI modes.
    (arm_mode_to_pred_mode): New.
    (arm_expand_vector_compare): Use the right VxBI mode instead of HI.
    (arm_expand_vcond): Likewise.
    (simd_valid_immediate): Handle MODE_VECTOR_BOOL.
    (mve_bool_vec_to_const): New.
    (neon_make_constant): Call mve_bool_vec_to_const when needed.
    * config/arm/arm_mve_builtins.def (vcmpneq_, vcmphiq_, vcmpcsq_)
    (vcmpltq_, vcmpleq_, vcmpgtq_, vcmpgeq_, vcmpeqq_, vcmpneq_f)
    (vcmpltq_f, vcmpleq_f, vcmpgtq_f, vcmpgeq_f, vcmpeqq_f, vpselq_u)
    (vpselq_s, vpselq_f): Use new predicated qualifiers.
    * config/arm/constraints.md (DB): New.
    * config/arm/iterators.md (MVE_7, MVE_7_HI): New mode iterators.
    (MVE_VPRED, MVE_vpred): New attribute iterators.
    * config/arm/mve.md (@mve_vcmpq_)
    (@mve_vcmpq_f, @mve_vpselq_)
    (@mve_vpselq_f): Use MVE_VPRED instead of HI.
    (@mve_vpselq_v2di): Define separately.
    (mov): New expander for VxBI modes.
    * config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Use
    MVE_7_HI iterator and add support for DB constraint.

    gcc/testsuite/
    PR target/100757
    PR target/101325
    * gcc.dg/rtl/arm/mve-vxbi.c: New test.
    * gcc.target/arm/simd/pr101325.c: New.
    * gcc.target/arm/simd/pr101325-2.c: New.
    * lib/target-supports.exp (check_effective_target_arm_mve_hw): Use
    add_options_for_arm_v8_1m_mve_fp.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #13 from CVS Commits ---
The master branch has been updated by Christophe Lyon:

https://gcc.gnu.org/g:884f77b489510e1df9db2889b60c5df6fcda

commit r12-7338-g884f77b489510e1df9db2889b60c5df6fcda
Author: Christophe Lyon
Date: Wed Oct 13 09:16:22 2021 +

    arm: Implement MVE predicates as vectors of booleans

    This patch implements support for vectors of booleans to support MVE
    predicates, instead of HImode. Since the ABI mandates pred16_t (aka
    uint16_t) to represent predicates in intrinsics prototypes, we
    introduce a new "predicate" type qualifier so that we can map
    relevant builtins HImode arguments and return value to the
    appropriate vector of booleans (VxBI).

    We have to update test_vector_ops_duplicate, because it iterates
    using an offset in bytes, where we would need to iterate in bits: we
    stop iterating when we reach the end of the vector of booleans.

    In addition, we have to fix the underlying definition of vectors of
    booleans because ARM/MVE needs a different representation than
    AArch64/SVE. With ARM/MVE the 'true' bit is duplicated over the
    element size, so that a true element of V4BI is represented by
    '0b'. This patch updates the aarch64 definition of VNx*BI as needed.

    Most of the work of this patch series was carried out while I was
    working at STMicroelectronics as a Linaro assignee.

    2022-02-22 Christophe Lyon
               Richard Sandiford

    gcc/
    PR target/100757
    PR target/101325
    * config/aarch64/aarch64-modes.def (VNx16BI, VNx8BI, VNx4BI,
    VNx2BI): Update definition.
    * config/arm/arm-builtins.cc (arm_init_simd_builtin_types): Add new
    simd types.
    (arm_init_builtin): Map predicate vectors arguments to HImode.
    (arm_expand_builtin_args): Move HImode predicate arguments to VxBI
    rtx. Move return value to HImode rtx.
    * config/arm/arm-builtins.h (arm_type_qualifiers): Add
    qualifier_predicate.
    * config/arm/arm-modes.def (B2I, B4I, V16BI, V8BI, V4BI): New modes.
    * config/arm/arm-simd-builtin-types.def (Pred1x16_t, Pred2x8_t,
    Pred4x4_t): New.
    * emit-rtl.cc (init_emit_once): Handle all boolean modes.
    * genmodes.cc (mode_data): Add boolean field.
    (blank_mode): Initialize it.
    (make_complex_modes): Fix handling of boolean modes.
    (make_vector_modes): Likewise.
    (VECTOR_BOOL_MODE): Use new COMPONENT parameter.
    (make_vector_bool_mode): Likewise.
    (BOOL_MODE): New.
    (make_bool_mode): New.
    (emit_insn_modes_h): Fix generation of boolean modes.
    (emit_class_narrowest_mode): Likewise.
    * machmode.def: (VECTOR_BOOL_MODE): Document new COMPONENT
    parameter. Use new BOOL_MODE instead of FRACTIONAL_INT_MODE to
    define BImode.
    * rtx-vector-builder.cc (rtx_vector_builder::find_cached_value): Fix
    handling of constm1_rtx for VECTOR_BOOL.
    * simplify-rtx.cc (native_encode_rtx): Fix support for VECTOR_BOOL.
    (native_decode_vector_rtx): Likewise.
    (test_vector_ops_duplicate): Skip vec_merge test with vectors of
    booleans.
    * varasm.cc (output_constant_pool_2): Likewise.
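The predicate layout described in the commit message for comment #13 can be illustrated with a small portable model. This is only a sketch, not GCC or MVE code: `model_pack_predicate` is a hypothetical helper, and it assumes the 16-bit predicate carries one bit per byte of a 128-bit vector, so that each element owns 16 / lane_count consecutive bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not part of GCC: pack per-lane truth values into a
   16-bit MVE-style predicate.  Assumes one predicate bit per byte of a
   128-bit vector, so lane_count must be 4, 8 or 16.  */
static uint16_t model_pack_predicate (const int *lane_true, size_t lane_count)
{
  size_t bits_per_lane = 16 / lane_count;   /* 4, 2 or 1 bits per element */
  uint16_t lane_mask = (uint16_t) ((1u << bits_per_lane) - 1u);
  uint16_t pred = 0;

  for (size_t i = 0; i < lane_count; i++)
    if (lane_true[i])
      /* A true element sets every bit that belongs to its lane.  */
      pred |= (uint16_t) (lane_mask << (i * bits_per_lane));

  return pred;
}
```

Under these assumptions a single true element of a four-lane vector occupies four adjacent bits, whereas an SVE-style layout would mark only the lowest bit of each element, which is why the two targets need different mode definitions.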
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Richard Biener changed:

           What    |Removed                     |Added
           Priority|P3                          |P1
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #12 from Christophe Lyon ---
As I am going on holidays until August (back only 2 days until then), I thought I should share my WIP here.
Not sure that's the right direction, anyway that's not working yet.

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index fa0fb0b..cf2c9b8 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -1633,6 +1633,10 @@ arm_init_simd_builtin_types (void)
   arm_simd_types[Bfloat16x4_t].eltype = arm_bf16_type_node;
   arm_simd_types[Bfloat16x8_t].eltype = arm_bf16_type_node;
+  arm_simd_types[Pred1x16_t].eltype = unsigned_intHI_type_node;
+  arm_simd_types[Pred2x8_t].eltype = unsigned_intHI_type_node;
+  arm_simd_types[Pred4x4_t].eltype = unsigned_intHI_type_node;
+
   for (i = 0; i < nelts; i++)
     {
       tree eltype = arm_simd_types[i].eltype;
diff --git a/gcc/config/arm/arm-modes.def b/gcc/config/arm/arm-modes.def
index a5e74ba..098831c 100644
--- a/gcc/config/arm/arm-modes.def
+++ b/gcc/config/arm/arm-modes.def
@@ -84,6 +84,15 @@ VECTOR_MODE (FLOAT, BF, 2); /* V2BF. */
 VECTOR_MODE (FLOAT, BF, 4); /*V4BF. */
 VECTOR_MODE (FLOAT, BF, 8); /*V8BF. */
+/* Predicates for MVE. */
+VECTOR_BOOL_MODE (VNx16BI, 16, 2);
+VECTOR_BOOL_MODE (VNx8BI, 8, 2);
+VECTOR_BOOL_MODE (VNx4BI, 4, 2);
+
+ADJUST_NUNITS (VNx16BI, arm_vg * 8);
+ADJUST_NUNITS (VNx8BI, arm_vg * 4);
+ADJUST_NUNITS (VNx4BI, arm_vg * 2);
+
 /* Fraction and accumulator vector modes. */
 VECTOR_MODES (FRACT, 4); /* V4QQ V2HQ */
 VECTOR_MODES (UFRACT, 4); /* V4UQQ V2UHQ */
diff --git a/gcc/config/arm/arm-simd-builtin-types.def b/gcc/config/arm/arm-simd-builtin-types.def
index c19a1b6..6a5053f 100644
--- a/gcc/config/arm/arm-simd-builtin-types.def
+++ b/gcc/config/arm/arm-simd-builtin-types.def
@@ -51,3 +51,7 @@ ENTRY (Bfloat16x2_t, V2BF, none, 32, bfloat16, 20)
   ENTRY (Bfloat16x4_t, V4BF, none, 64, bfloat16, 20)
   ENTRY (Bfloat16x8_t, V8BF, none, 128, bfloat16, 20)
+
+  ENTRY (Pred1x16_t, VNx16BI, unsigned, 16, uint16, 21)
+  ENTRY (Pred2x8_t, VNx8BI, unsigned, 8, uint16, 21)
+  ENTRY (Pred4x4_t, VNx4BI, unsigned, 4, uint16, 21)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f967239..98ff238 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3446,6 +3446,8 @@ arm_configure_build_target (struct arm_build_target *target,
   arm_option_reconfigure_globals ();
 }
+poly_uint16 arm_vg;
+
 /* Fix up any incompatible options that the user has specified. */
 static void
 arm_option_override (void)
@@ -3458,6 +3460,7 @@ arm_option_override (void)
   static const enum isa_feature quirk_bitlist[] = { ISA_ALL_QUIRKS, isa_nobit};
   cl_target_option opts;
+  arm_vg = 2;
   isa_quirkbits = sbitmap_alloc (isa_num_bits);
   arm_initialize_isa (isa_quirkbits, quirk_bitlist);
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 8e5bd57..df9bbb2 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2493,4 +2493,7 @@ const char *arm_be8_option (int argc, const char **argv);
    representation for SHF_ARM_PURECODE in GCC. */
 #define SECTION_ARM_PURECODE SECTION_MACH_DEP
+#ifndef USED_FOR_TARGET
+extern poly_uint16 arm_vg;
+#endif
 #endif /* ! GCC_ARM_H */
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 83f1003..765ec5a 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -3524,7 +3524,7 @@ __arm_vaddlvq_u32 (uint32x4_t __a)
   return __builtin_mve_vaddlvq_uv4si (__a);
 }
-__extension__ extern __inline int64_t
+__extension__ extern __inline mve_pred16_t/*int64_t*/
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vctp16q (uint32_t __a)
 {
diff --git a/gcc/config/arm/arm_mve_types.h b/gcc/config/arm/arm_mve_types.h
index 8958f4e..536e816 100644
--- a/gcc/config/arm/arm_mve_types.h
+++ b/gcc/config/arm/arm_mve_types.h
@@ -34,7 +34,8 @@ typedef struct { float32x4_t val[2]; } float32x4x2_t;
 typedef struct { float32x4_t val[4]; } float32x4x4_t;
 #endif
-typedef uint16_t mve_pred16_t;
+//typedef uint16_t mve_pred16_t;
+typedef __simd16_uint16_t mve_pred16_t;
 typedef __simd128_uint8_t uint8x16_t;
 typedef __simd128_uint16_t uint16x8_t;
 typedef __simd128_uint32_t uint32x4_t;
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 5c4fe89..2656f6b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -948,6 +948,8 @@ (define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32")
 (V8HF "u16") (V4SF "32")])
 (define_mode_attr earlyclobber_32 [(V16QI "=w") (V8HI "=w") (V4SI "=")
 (V8HF "=w") (V4SF "=")])
+;;(define_mode_attr MVE_VPRED [(V16QI "VNx16BI") (V8HI "VNx8BI") (V4SI "VNx4BI")])
+(define_mode_attr MVE_VPRED [(V16QI "VNx16BI") (V8HI "VNx16BI") (V4SI "VNx16BI") (V8HF "VNx16BI") (V4SF "VNx16BI")])
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #11 from Richard Earnshaw ---
(In reply to Christophe Lyon from comment #10)
> This was introduced by my change at r12-671 in mve.md:
> -;; [vcmpneq_])
> +;; [vcmpneq_, vcmpcsq_, vcmpeqq_, vcmpgeq_, vcmpgtq_, vcmphiq_, vcmpleq_, vcmpltq_])
> ;;
> -(define_insn "mve_vcmpneq_"
> +(define_insn "mve_vcmpq_"
> [
> (set (match_operand:HI 0 "vpr_register_operand" "=Up")
> - (unspec:HI [(match_operand:MVE_2 1 "s_register_operand" "w")
> - (match_operand:MVE_2 2 "s_register_operand" "w")]
> - VCMPNEQ))
> + (MVE_COMPARISONS:HI (match_operand:MVE_2 1 "s_register_operand" "w")
> + (match_operand:MVE_2 2 "s_register_operand" "w")))
> + ]
> + "TARGET_HAVE_MVE"
> + "vcmp.%# , %q1, %q2"
> + [(set_attr "type" "mve_move")
> +])
>
> So should that use MVE_COMPARISONS:MVE_2, and somehow include a conversion
> from MVE_2 modes into HI for vpr.p0 in the pattern?

Thinking about it some more, it might be the use of HI on the 'eq' that is incorrect. If the result were a vector, then the construct might have more meaning.

I think you'll need to look at how predicates are handled in the aarch64 SVE code. For example, you might need to introduce the concept of VECTOR_BOOL_MODEs.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #10 from Christophe Lyon ---
This was introduced by my change at r12-671 in mve.md:

-;; [vcmpneq_])
+;; [vcmpneq_, vcmpcsq_, vcmpeqq_, vcmpgeq_, vcmpgtq_, vcmphiq_, vcmpleq_, vcmpltq_])
 ;;
-(define_insn "mve_vcmpneq_"
+(define_insn "mve_vcmpq_"
 [
 (set (match_operand:HI 0 "vpr_register_operand" "=Up")
- (unspec:HI [(match_operand:MVE_2 1 "s_register_operand" "w")
- (match_operand:MVE_2 2 "s_register_operand" "w")]
- VCMPNEQ))
+ (MVE_COMPARISONS:HI (match_operand:MVE_2 1 "s_register_operand" "w")
+ (match_operand:MVE_2 2 "s_register_operand" "w")))
+ ]
+ "TARGET_HAVE_MVE"
+ "vcmp.%# , %q1, %q2"
+ [(set_attr "type" "mve_move")
+])

So should that use MVE_COMPARISONS:MVE_2, and somehow include a conversion from MVE_2 modes into HI for vpr.p0 in the pattern?
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #9 from Richard Earnshaw ---
(insn 7 4 8 2 (set (reg:HI 117)
        (eq:HI (reg:V16QI 119)
            (reg:V16QI 120))) {mve_vcmpeqq_v16qi}
     (expr_list:REG_DEAD (reg:V16QI 120)
        (expr_list:REG_DEAD (reg:V16QI 119)
            (nil))))

This is wrong. EQ has a result of 0 or 1, regardless of the mode.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #8 from Christophe Lyon ---
Indeed, it's what happens in try_combine():

    i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);

converts i2src

    (zero_extend:SI (reg:HI 117))

into:

    (and:SI (subreg:SI (reg:HI 117) 0)
        (const_int 1 [0x1]))
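The effect of that substitution on the returned value can be modelled in plain C. This is a sketch of the arithmetic only, not of GCC internals; both function names are hypothetical.

```c
#include <stdint.h>

/* What the source asks for: zero-extend the 16-bit predicate to 32 bits,
   i.e. (zero_extend:SI (reg:HI 117)).  */
static uint32_t widen_zero_extend (uint16_t pred)
{
  return (uint32_t) pred;
}

/* What the rewritten RTL computes:
   (and:SI (subreg:SI (reg:HI 117) 0) (const_int 1)).
   Only bit 0 survives, which is valid only if the value is known
   to be 0 or 1.  */
static uint32_t widen_and_one (uint16_t pred)
{
  return (uint32_t) pred & 1u;
}
```

For a predicate with more than one bit set, the two functions disagree: with all sixteen bits set, the first keeps the full value while the second collapses it to 1, which is consistent with the wrong-code symptom reported in this PR.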
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #7 from Christophe Lyon ---
Before the patch:

Trying 8 -> 14:
    8: r113:SI=zero_extend(r117:HI)
      REG_DEAD r117:HI
   14: r0:SI=r113:SI
      REG_DEAD r113:SI
Successfully matched this instruction:
(set (reg/i:SI 0 r0)
    (zero_extend:SI (reg:HI 117)))
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i314: r0:SI=zero_extend(r117:HI)
      REG_DEAD r117:HI
deferring rescan insn with uid = 14.

After:

Trying 8 -> 14:
    8: r113:SI=zero_extend(r117:HI)
      REG_DEAD r117:HI
   14: r0:SI=r113:SI
      REG_DEAD r113:SI
Successfully matched this instruction:
(set (reg/i:SI 0 r0)
    (and:SI (subreg:SI (reg:HI 117) 0)
        (const_int 1 [0x1])))
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i314: r0:SI=r117:HI#0&0x1
      REG_DEAD r117:HI
deferring rescan insn with uid = 14.

So with the same inputs, combine makes a different decision, I guess that's because it has some knowledge that reg:HI 117 is the result of a comparison rather than an unspec.
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #6 from Christophe Lyon ---
Before r12-671 before combine we have:

(insn 7 4 8 2 (set (reg:HI 117)
        (unspec:HI [
                (reg/v:V16QI 115 [ v ])
                (reg/v:V16QI 116 [ w ])
            ] VCMPEQQ_S)) "arm_mve.h":4210:10 3228 {mve_vcmpeqq_v16qi}
     (expr_list:REG_DEAD (reg/v:V16QI 116 [ w ])
        (expr_list:REG_DEAD (reg/v:V16QI 115 [ v ])
            (nil))))
(insn 8 7 14 2 (set (reg:SI 113 [ _5 ])
        (zero_extend:SI (reg:HI 117))) "arm_mve.h":4210:10 1019 {*thumb2_zero_extendhisi2_v6}
     (expr_list:REG_DEAD (reg:HI 117)
        (nil)))

After r12-671, we have:

(insn 7 4 8 2 (set (reg:HI 117)
        (eq:HI (reg/v:V16QI 115 [ v ])
            (reg/v:V16QI 116 [ w ]))) "arm_mve.h":4210:10 3173 {mve_vcmpeqq_v16qi}
     (expr_list:REG_DEAD (reg/v:V16QI 116 [ w ])
        (expr_list:REG_DEAD (reg/v:V16QI 115 [ v ])
            (nil))))
(insn 8 7 14 2 (set (reg:SI 113 [ _5 ])
        (zero_extend:SI (reg:HI 117))) "arm_mve.h":4210:10 1019 {*thumb2_zero_extendhisi2_v6}
     (expr_list:REG_DEAD (reg:HI 117)
        (nil)))
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #5 from Christophe Lyon ---
Before the patch, combine says:

allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i314: r0:SI=zero_extend(r117:HI)
      REG_DEAD r117:HI
deferring rescan insn with uid = 14.
starting the processing of deferred insns
rescanning insn with uid = 7.
rescanning insn with uid = 14.
ending the processing of deferred insns

After the patch:

allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i314: r0:SI=r117:HI#0&0x1
      REG_DEAD r117:HI
deferring rescan insn with uid = 14.
starting the processing of deferred insns
rescanning insn with uid = 7.
rescanning insn with uid = 14.
ending the processing of deferred insns
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon changed:

           What    |Removed                     |Added
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2021-07-07
     Ever confirmed|0                           |1

--- Comment #4 from Christophe Lyon ---
Before r12-671, in combine we have:

(insn 7 4 8 2 (set (reg:HI 117)
        (unspec:HI [
                (reg:V16QI 119)
                (reg:V16QI 120)
            ] VCMPEQQ_S)) {mve_vcmpeqq_v16qi}
     (expr_list:REG_DEAD (reg:V16QI 120)
        (expr_list:REG_DEAD (reg:V16QI 119)
            (nil))))
(note 8 7 14 2 NOTE_INSN_DELETED)
(insn 14 8 15 2 (set (reg/i:SI 0 r0)
        (zero_extend:SI (reg:HI 117))) "pr101325.c":7:1 1019 {*thumb2_zero_extendhisi2_v6}
     (expr_list:REG_DEAD (reg:HI 117)
        (nil)))

After the patch:

(insn 7 4 8 2 (set (reg:HI 117)
        (eq:HI (reg:V16QI 119)
            (reg:V16QI 120))) {mve_vcmpeqq_v16qi}
     (expr_list:REG_DEAD (reg:V16QI 120)
        (expr_list:REG_DEAD (reg:V16QI 119)
            (nil))))
(note 8 7 14 2 NOTE_INSN_DELETED)
(insn 14 8 15 2 (set (reg/i:SI 0 r0)
        (and:SI (subreg:SI (reg:HI 117) 0)
            (const_int 1 [0x1]))) "pr101325.c":7:1 90 {*arm_andsi3_insn}
     (expr_list:REG_DEAD (reg:HI 117)
        (nil)))
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon changed:

           What    |Removed                     |Added
           Assignee|unassigned at gcc dot gnu.org |clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon ---
I will have a look
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Alex Coplan changed:

           What    |Removed                     |Added
            Summary|[12 Regression] arm: Wrong  |[12 Regression] arm: Wrong
                   |code with MVE vcmpeqq       |code with MVE vcmpeqq
                   |intrinsic                   |intrinsic since
                   |                            |r12-671-gd083fbf72
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #2 from Alex Coplan ---
Started with r12-671-gd083fbf72d4533d2009c725524983e1184981e74:

commit d083fbf72d4533d2009c725524983e1184981e74
Author: Christophe Lyon
Date: Mon May 10 13:52:02 2021

    arm: MVE: Factorize all vcmp* integer patterns
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Richard Biener changed:

           What    |Removed                     |Added
   Target Milestone|---                         |12.0
[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #1 from Alex Coplan ---
Execution test for the testsuite:

#include <arm_mve.h>

__attribute((noinline,noipa))
unsigned foo(int8x16_t v, int8x16_t w)
{
  return vcmpeqq (v, w);
}

int main(void)
{
  if (foo (vdupq_n_s8(0), vdupq_n_s8(0)) != 0xU)
    __builtin_abort ();
}
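For readers without an MVE toolchain, the lane-by-lane behaviour the test relies on can be sketched in portable C. Everything below is a hypothetical model (the `model_` names are not part of arm_mve.h), assuming that bit i of the 16-bit predicate reflects the comparison of byte lane i.

```c
#include <stdint.h>

/* Hypothetical stand-in for int8x16_t: sixteen signed byte lanes.  */
typedef struct { int8_t lane[16]; } model_int8x16;

/* Model of vdupq_n_s8: broadcast one value to every lane.  */
static model_int8x16 model_vdupq_n_s8 (int8_t value)
{
  model_int8x16 v;
  for (int i = 0; i < 16; i++)
    v.lane[i] = value;
  return v;
}

/* Model of vcmpeqq on byte lanes: set bit i when lane i compares equal.  */
static uint16_t model_vcmpeqq_s8 (model_int8x16 a, model_int8x16 b)
{
  uint16_t pred = 0;
  for (int i = 0; i < 16; i++)
    if (a.lane[i] == b.lane[i])
      pred |= (uint16_t) (1u << i);
  return pred;
}
```

In this model, comparing two vectors that are equal in every lane sets all sixteen predicate bits, so a result reduced to a single bit would fail the test's check.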