[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #20 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:fd0ab7c734b04b91653467b94afd48ceca122083

commit r12-7356-gfd0ab7c734b04b91653467b94afd48ceca122083
Author: Christophe Lyon 
Date:   Wed Feb 23 06:44:12 2022 +

arm: Fix typo in auto-vectorized MVE comparisons

I made a last-minute renaming of mve_const_bool_vec_to_hi () into
mve_bool_vec_to_const () and forgot to update the call sites in vfp.md
accordingly.

Committed as obvious.

2022-02-23  Christophe Lyon 

gcc/
PR target/100757
PR target/101325
* config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Fix
typo.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #19 from Christophe Lyon  ---
Should be fixed, at last.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #18 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:c6b4ea7ab1aa6c5c07798fa6c6ad15dd1761b5ed

commit r12-7344-gc6b4ea7ab1aa6c5c07798fa6c6ad15dd1761b5ed
Author: Christophe Lyon 
Date:   Wed Oct 13 09:16:49 2021 +

arm: Convert more MVE/CDE builtins to predicate qualifiers

This patch covers a few non-load/store builtins where we do not use
the <mode> iterator and thus we cannot use <MVE_vpred>.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  

gcc/
PR target/100757
PR target/101325
* config/arm/arm-builtins.cc (CX_UNARY_UNONE_QUALIFIERS): Use
predicate.
(CX_BINARY_UNONE_QUALIFIERS): Likewise.
(CX_TERNARY_UNONE_QUALIFIERS): Likewise.
(TERNOP_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
(QUADOP_NONE_NONE_NONE_NONE_UNONE_QUALIFIERS): Delete.
(QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE_QUALIFIERS): Delete.
* config/arm/arm_mve_builtins.def: Use predicated qualifiers.
* config/arm/mve.md: Use VxBI instead of HI.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #17 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:6a7c13a0cf2290b60ab36f9ce1027b92838586bd

commit r12-7343-g6a7c13a0cf2290b60ab36f9ce1027b92838586bd
Author: Christophe Lyon 
Date:   Wed Oct 20 15:39:17 2021 +

arm: Convert more load/store MVE builtins to predicate qualifiers

This patch covers a few builtins where we do not use the <mode>
iterator and thus we cannot use <MVE_vpred>.

For v2di instructions, we keep the HI mode for predicates.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  

gcc/
PR target/100757
PR target/101325
* config/arm/arm-builtins.cc (STRSBS_P_QUALIFIERS): Use predicate
qualifier.
(STRSBU_P_QUALIFIERS): Likewise.
(LDRGBS_Z_QUALIFIERS): Likewise.
(LDRGBU_Z_QUALIFIERS): Likewise.
(LDRGBWBXU_Z_QUALIFIERS): Likewise.
(LDRGBWBS_Z_QUALIFIERS): Likewise.
(LDRGBWBU_Z_QUALIFIERS): Likewise.
(STRSBWBS_P_QUALIFIERS): Likewise.
(STRSBWBU_P_QUALIFIERS): Likewise.
* config/arm/mve.md: Use VxBI instead of HI.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #16 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:724d6566cd11c676f3bc082a9771784c825affb1

commit r12-7342-g724d6566cd11c676f3bc082a9771784c825affb1
Author: Christophe Lyon 
Date:   Wed Oct 13 09:16:40 2021 +

arm: Convert more MVE builtins to predicate qualifiers

This patch covers all builtins that have an HI operand and use the
<mode> iterator, thus we can replace HI with <MVE_VPRED>.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  

gcc/
PR target/100757
PR target/101325
* config/arm/arm-builtins.cc
(TERNOP_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
(TERNOP_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
(TERNOP_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
(TERNOP_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
(TERNOP_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
(TERNOP_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
(TERNOP_NONE_NONE_UNONE_UNONE_QUALIFIERS): Change to ...
(TERNOP_NONE_NONE_UNONE_PRED_QUALIFIERS): ... this.
(QUADOP_UNONE_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
(QUADOP_UNONE_UNONE_NONE_NONE_PRED_QUALIFIERS): ... this.
(QUADOP_NONE_NONE_NONE_NONE_PRED_QUALIFIERS): New.
(QUADOP_NONE_NONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
(QUADOP_NONE_NONE_NONE_IMM_PRED_QUALIFIERS): ... this.
(QUADOP_UNONE_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
(QUADOP_UNONE_UNONE_NONE_IMM_UNONE_QUALIFIERS): Change to ...
(QUADOP_UNONE_UNONE_NONE_IMM_PRED_QUALIFIERS): ... this.
(QUADOP_NONE_NONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
(QUADOP_NONE_NONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
(QUADOP_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to ...
(QUADOP_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
(QUADOP_UNONE_UNONE_UNONE_NONE_UNONE_QUALIFIERS): Change to ...
(QUADOP_UNONE_UNONE_UNONE_NONE_PRED_QUALIFIERS): ... this.
(STRS_P_QUALIFIERS): Use predicate qualifier.
(STRU_P_QUALIFIERS): Likewise.
(STRSU_P_QUALIFIERS): Likewise.
(STRSS_P_QUALIFIERS): Likewise.
(LDRGS_Z_QUALIFIERS): Likewise.
(LDRGU_Z_QUALIFIERS): Likewise.
(LDRS_Z_QUALIFIERS): Likewise.
(LDRU_Z_QUALIFIERS): Likewise.
(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_UNONE_QUALIFIERS): Change to
...
(QUINOP_UNONE_UNONE_UNONE_UNONE_IMM_PRED_QUALIFIERS): ... this.
(BINOP_NONE_NONE_PRED_QUALIFIERS): New.
(BINOP_UNONE_UNONE_PRED_QUALIFIERS): New.
* config/arm/arm_mve_builtins.def: Use new predicated qualifiers.
* config/arm/mve.md: Use MVE_VPRED instead of HI.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #15 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:e6a4aefce8e47a7d3ba781066a1410ebfa963e59

commit r12-7341-ge6a4aefce8e47a7d3ba781066a1410ebfa963e59
Author: Christophe Lyon 
Date:   Wed Oct 13 09:16:35 2021 +

arm: Convert remaining MVE vcmp builtins to predicate qualifiers

This is mostly a mechanical change, only tested by the intrinsics
expansion tests.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  

gcc/
PR target/100757
PR target/101325
* config/arm/arm-builtins.cc (BINOP_UNONE_NONE_NONE_QUALIFIERS):
Delete.
(TERNOP_UNONE_NONE_NONE_UNONE_QUALIFIERS): Change to ...
(TERNOP_PRED_NONE_NONE_PRED_QUALIFIERS): ... this.
(TERNOP_PRED_UNONE_UNONE_PRED_QUALIFIERS): New.
* config/arm/arm_mve_builtins.def (vcmp*q_n_, vcmp*q_m_f): Use new
predicated qualifiers.
* config/arm/mve.md (mve_vcmpq_n_)
(mve_vcmp*q_m_f): Use MVE_VPRED instead of HI.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #14 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:91224cf625dc90304bb515a0cc602beed48fe3da

commit r12-7339-g91224cf625dc90304bb515a0cc602beed48fe3da
Author: Christophe Lyon 
Date:   Wed Oct 13 09:16:27 2021 +

arm: Implement auto-vectorized MVE comparisons with vectors of boolean
predicates

We make use of qualifier_predicate to describe MVE builtins
prototypes, restricting to auto-vectorizable vcmp* and vpsel builtins,
as they are exercised by the tests added earlier in the series.

Special handling is needed for mve_vpselq because it has a v2di
variant, which has no natural VPR.P0 representation: we keep HImode
for it.

The vector_compare expansion code is updated to use the right VxBI
mode instead of HI for the result.

We extend the existing thumb2_movhi_vfp and thumb2_movhi_fp16 patterns
to use the new MVE_7_HI iterator which covers HI and the new VxBI
modes, in conjunction with the new DB constraint for a constant vector
of booleans.

This patch also adds tests derived from the one provided in PR
target/101325: there is a compile-only test because I did not have
access to anything that could execute MVE code until recently.  I have
been able to add an executable test since QEMU supports MVE.

Instead of adding arm_v8_1m_mve_hw, I update arm_mve_hw so that it
uses add_options_for_arm_v8_1m_mve_fp, like arm_neon_hw does.  This
ensures arm_mve_hw passes even if the toolchain does not generate MVE
code by default.

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon 
Richard Sandiford  

gcc/
PR target/100757
PR target/101325
* config/arm/arm-builtins.cc (BINOP_PRED_UNONE_UNONE_QUALIFIERS)
(BINOP_PRED_NONE_NONE_QUALIFIERS)
(TERNOP_NONE_NONE_NONE_PRED_QUALIFIERS)
(TERNOP_UNONE_UNONE_UNONE_PRED_QUALIFIERS): New.
* config/arm/arm-protos.h (mve_bool_vec_to_const): New.
* config/arm/arm.cc (arm_hard_regno_mode_ok): Handle new VxBI
modes.
(arm_mode_to_pred_mode): New.
(arm_expand_vector_compare): Use the right VxBI mode instead of
HI.
(arm_expand_vcond): Likewise.
(simd_valid_immediate): Handle MODE_VECTOR_BOOL.
(mve_bool_vec_to_const): New.
(neon_make_constant): Call mve_bool_vec_to_const when needed.
* config/arm/arm_mve_builtins.def (vcmpneq_, vcmphiq_, vcmpcsq_)
(vcmpltq_, vcmpleq_, vcmpgtq_, vcmpgeq_, vcmpeqq_, vcmpneq_f)
(vcmpltq_f, vcmpleq_f, vcmpgtq_f, vcmpgeq_f, vcmpeqq_f, vpselq_u)
(vpselq_s, vpselq_f): Use new predicated qualifiers.
* config/arm/constraints.md (DB): New.
* config/arm/iterators.md (MVE_7, MVE_7_HI): New mode iterators.
(MVE_VPRED, MVE_vpred): New attribute iterators.
* config/arm/mve.md (@mve_vcmpq_)
(@mve_vcmpq_f, @mve_vpselq_)
(@mve_vpselq_f): Use MVE_VPRED instead of HI.
(@mve_vpselq_v2di): Define separately.
(mov): New expander for VxBI modes.
* config/arm/vfp.md (thumb2_movhi_vfp, thumb2_movhi_fp16): Use
MVE_7_HI iterator and add support for DB constraint.

gcc/testsuite/
PR target/100757
PR target/101325
* gcc.dg/rtl/arm/mve-vxbi.c: New test.
* gcc.target/arm/simd/pr101325.c: New.
* gcc.target/arm/simd/pr101325-2.c: New.
* lib/target-supports.exp (check_effective_target_arm_mve_hw): Use
add_options_for_arm_v8_1m_mve_fp.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-02-22 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #13 from CVS Commits  ---
The master branch has been updated by Christophe Lyon :

https://gcc.gnu.org/g:884f77b489510e1df9db2889b60c5df6fcda

commit r12-7338-g884f77b489510e1df9db2889b60c5df6fcda
Author: Christophe Lyon 
Date:   Wed Oct 13 09:16:22 2021 +

arm: Implement MVE predicates as vectors of booleans

This patch implements support for vectors of booleans to support MVE
predicates, instead of HImode.  Since the ABI mandates pred16_t (aka
uint16_t) to represent predicates in intrinsics prototypes, we
introduce a new "predicate" type qualifier so that we can map the
HImode arguments and return values of the relevant builtins to the
appropriate vector of booleans (VxBI).

We have to update test_vector_ops_duplicate, because it iterates using
an offset in bytes, where we would need to iterate in bits: we stop
iterating when we reach the end of the vector of booleans.

In addition, we have to fix the underlying definition of vectors of
booleans because ARM/MVE needs a different representation than
AArch64/SVE. With ARM/MVE the 'true' bit is duplicated over the
element size, so that a true element of V4BI is represented by
'0b1111'.  This patch updates the aarch64 definition of VNx*BI as
needed.
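
To make the representation concrete, here is a minimal C sketch (not GCC
code) of how per-lane booleans map onto the 16-bit predicate under this
scheme.  The helper name pack_pred is hypothetical, and the sketch assumes
that lane 0 occupies the least significant bits and that each lane gets
16 / nlanes bits: 4 bits per lane for V4BI, 2 for V8BI and 1 for V16BI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: build a 16-bit MVE-style
   predicate from per-lane booleans.  Assumes lane 0 sits in the least
   significant bits and each lane owns 16 / nlanes consecutive bits.  */
uint16_t pack_pred(const int *lanes, int nlanes)
{
    assert(nlanes == 16 || nlanes == 8 || nlanes == 4);

    int bits_per_lane = 16 / nlanes;
    /* All bits of one lane set: 0x1, 0x3 or 0xf.  */
    uint16_t lane_mask = (uint16_t)((1u << bits_per_lane) - 1u);
    uint16_t pred = 0;

    for (int i = 0; i < nlanes; i++)
        if (lanes[i])
            /* A true lane duplicates the 'true' bit over its whole width.  */
            pred |= (uint16_t)(lane_mask << (i * bits_per_lane));

    return pred;
}
```

Under these assumptions, pack_pred((int[]){1, 0, 0, 0}, 4) yields 0x000f:
a single true V4BI element shows up as four set bits in the predicate.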

Most of the work of this patch series was carried out while I was
working at STMicroelectronics as a Linaro assignee.

2022-02-22  Christophe Lyon  
Richard Sandiford  

gcc/
PR target/100757
PR target/101325
* config/aarch64/aarch64-modes.def (VNx16BI, VNx8BI, VNx4BI,
VNx2BI): Update definition.
* config/arm/arm-builtins.cc (arm_init_simd_builtin_types): Add new
simd types.
(arm_init_builtin): Map predicate vectors arguments to HImode.
(arm_expand_builtin_args): Move HImode predicate arguments to VxBI
rtx. Move return value to HImode rtx.
* config/arm/arm-builtins.h (arm_type_qualifiers): Add
qualifier_predicate.
* config/arm/arm-modes.def (B2I, B4I, V16BI, V8BI, V4BI): New
modes.
* config/arm/arm-simd-builtin-types.def (Pred1x16_t,
Pred2x8_t,Pred4x4_t): New.
* emit-rtl.cc (init_emit_once): Handle all boolean modes.
* genmodes.cc (mode_data): Add boolean field.
(blank_mode): Initialize it.
(make_complex_modes): Fix handling of boolean modes.
(make_vector_modes): Likewise.
(VECTOR_BOOL_MODE): Use new COMPONENT parameter.
(make_vector_bool_mode): Likewise.
(BOOL_MODE): New.
(make_bool_mode): New.
(emit_insn_modes_h): Fix generation of boolean modes.
(emit_class_narrowest_mode): Likewise.
* machmode.def: (VECTOR_BOOL_MODE): Document new COMPONENT
parameter.  Use new BOOL_MODE instead of FRACTIONAL_INT_MODE to
define BImode.
* rtx-vector-builder.cc (rtx_vector_builder::find_cached_value):
Fix handling of constm1_rtx for VECTOR_BOOL.
* simplify-rtx.cc (native_encode_rtx): Fix support for VECTOR_BOOL.
(native_decode_vector_rtx): Likewise.
(test_vector_ops_duplicate): Skip vec_merge test
with vectors of booleans.
* varasm.cc (output_constant_pool_2): Likewise.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2022-01-17 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-09 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #12 from Christophe Lyon  ---
As I am going on holidays until August (back only 2 days until then), I thought
I should share my WIP here. Not sure that's the right direction; anyway, it is
not working yet.

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index fa0fb0b..cf2c9b8 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -1633,6 +1633,10 @@ arm_init_simd_builtin_types (void)
   arm_simd_types[Bfloat16x4_t].eltype = arm_bf16_type_node;
   arm_simd_types[Bfloat16x8_t].eltype = arm_bf16_type_node;

+  arm_simd_types[Pred1x16_t].eltype = unsigned_intHI_type_node;
+  arm_simd_types[Pred2x8_t].eltype = unsigned_intHI_type_node;
+  arm_simd_types[Pred4x4_t].eltype = unsigned_intHI_type_node;
+
   for (i = 0; i < nelts; i++)
 {
   tree eltype = arm_simd_types[i].eltype;
diff --git a/gcc/config/arm/arm-modes.def b/gcc/config/arm/arm-modes.def
index a5e74ba..098831c 100644
--- a/gcc/config/arm/arm-modes.def
+++ b/gcc/config/arm/arm-modes.def
@@ -84,6 +84,15 @@ VECTOR_MODE (FLOAT, BF, 2);   /* V2BF.  */
 VECTOR_MODE (FLOAT, BF, 4);   /*V4BF.  */
 VECTOR_MODE (FLOAT, BF, 8);   /*V8BF.  */

+/* Predicates for MVE.  */
+VECTOR_BOOL_MODE (VNx16BI, 16, 2);
+VECTOR_BOOL_MODE (VNx8BI, 8, 2);
+VECTOR_BOOL_MODE (VNx4BI, 4, 2);
+
+ADJUST_NUNITS (VNx16BI, arm_vg * 8);
+ADJUST_NUNITS (VNx8BI, arm_vg * 4);
+ADJUST_NUNITS (VNx4BI, arm_vg * 2);
+
 /* Fraction and accumulator vector modes.  */
 VECTOR_MODES (FRACT, 4);  /* V4QQ  V2HQ */
 VECTOR_MODES (UFRACT, 4); /* V4UQQ V2UHQ */
diff --git a/gcc/config/arm/arm-simd-builtin-types.def b/gcc/config/arm/arm-simd-builtin-types.def
index c19a1b6..6a5053f 100644
--- a/gcc/config/arm/arm-simd-builtin-types.def
+++ b/gcc/config/arm/arm-simd-builtin-types.def
@@ -51,3 +51,7 @@
   ENTRY (Bfloat16x2_t, V2BF, none, 32, bfloat16, 20)
   ENTRY (Bfloat16x4_t, V4BF, none, 64, bfloat16, 20)
   ENTRY (Bfloat16x8_t, V8BF, none, 128, bfloat16, 20)
+
+  ENTRY (Pred1x16_t, VNx16BI, unsigned, 16, uint16, 21)
+  ENTRY (Pred2x8_t, VNx8BI, unsigned, 8, uint16, 21)
+  ENTRY (Pred4x4_t, VNx4BI, unsigned, 4, uint16, 21)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index f967239..98ff238 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3446,6 +3446,8 @@ arm_configure_build_target (struct arm_build_target
*target,
   arm_option_reconfigure_globals ();
 }

+poly_uint16 arm_vg;
+
 /* Fix up any incompatible options that the user has specified.  */
 static void
 arm_option_override (void)
@@ -3458,6 +3460,7 @@ arm_option_override (void)
   static const enum isa_feature quirk_bitlist[] = { ISA_ALL_QUIRKS,
isa_nobit};
   cl_target_option opts;

+  arm_vg = 2;
   isa_quirkbits = sbitmap_alloc (isa_num_bits);
   arm_initialize_isa (isa_quirkbits, quirk_bitlist);

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 8e5bd57..df9bbb2 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -2493,4 +2493,7 @@ const char *arm_be8_option (int argc, const char **argv);
representation for SHF_ARM_PURECODE in GCC.  */
 #define SECTION_ARM_PURECODE SECTION_MACH_DEP

+#ifndef USED_FOR_TARGET
+extern poly_uint16 arm_vg;
+#endif
 #endif /* ! GCC_ARM_H */
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 83f1003..765ec5a 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -3524,7 +3524,7 @@ __arm_vaddlvq_u32 (uint32x4_t __a)
   return __builtin_mve_vaddlvq_uv4si (__a);
 }

-__extension__ extern __inline int64_t
+__extension__ extern __inline mve_pred16_t/*int64_t*/
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vctp16q (uint32_t __a)
 {
diff --git a/gcc/config/arm/arm_mve_types.h b/gcc/config/arm/arm_mve_types.h
index 8958f4e..536e816 100644
--- a/gcc/config/arm/arm_mve_types.h
+++ b/gcc/config/arm/arm_mve_types.h
@@ -34,7 +34,8 @@ typedef struct { float32x4_t val[2]; } float32x4x2_t;
 typedef struct { float32x4_t val[4]; } float32x4x4_t;
 #endif

-typedef uint16_t mve_pred16_t;
+//typedef uint16_t mve_pred16_t;
+typedef __simd16_uint16_t mve_pred16_t;
 typedef __simd128_uint8_t uint8x16_t;
 typedef __simd128_uint16_t uint16x8_t;
 typedef __simd128_uint32_t uint32x4_t;
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 5c4fe89..2656f6b 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -948,6 +948,8 @@ (define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16")
(V4SI "32")
   (V8HF "u16") (V4SF "32")])
 (define_mode_attr earlyclobber_32 [(V16QI "=w") (V8HI "=w") (V4SI "=")
(V8HF "=w") (V4SF "=")])
+;;(define_mode_attr MVE_VPRED [(V16QI "VNx16BI") (V8HI "VNx8BI") (V4SI
"VNx4BI")])
+(define_mode_attr MVE_VPRED [(V16QI "VNx16BI") (V8HI "VNx16BI") (V4SI
"VNx16BI") (V8HF "VNx16BI") (V4SF "VNx16BI")])

 

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #11 from Richard Earnshaw  ---
(In reply to Christophe Lyon from comment #10)
> This was introduced by my change at r12-671 in mve.md:
> -;; [vcmpneq_])
> +;; [vcmpneq_, vcmpcsq_, vcmpeqq_, vcmpgeq_, vcmpgtq_, vcmphiq_, vcmpleq_,
> vcmpltq_])
>  ;;
> -(define_insn "mve_vcmpneq_"
> +(define_insn "mve_vcmpq_"
>[
> (set (match_operand:HI 0 "vpr_register_operand" "=Up")
> -   (unspec:HI [(match_operand:MVE_2 1 "s_register_operand" "w")
> -   (match_operand:MVE_2 2 "s_register_operand" "w")]
> -VCMPNEQ))
> +   (MVE_COMPARISONS:HI (match_operand:MVE_2 1 "s_register_operand" "w")
> +   (match_operand:MVE_2 2 "s_register_operand" "w")))
> +  ]
> +  "TARGET_HAVE_MVE"
> +  "vcmp.%#  , %q1, %q2"
> +  [(set_attr "type" "mve_move")
> +])
> 
> So should that use MVE_COMPARISONS:MVE_2, and somehow include a conversion
> from MVE_2 modes into HI for vpr.p0 in the pattern?

Thinking about it some more, it might be the use of HI on the 'eq' that is
incorrect.  If the result were a vector, then the construct might have more
meaning.

I think you'll need to look at how predicates are handled in the aarch64 SVE
code.  For example, you might need to introduce the concept of
VECTOR_BOOL_MODEs.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #10 from Christophe Lyon  ---
This was introduced by my change at r12-671 in mve.md:
-;; [vcmpneq_])
+;; [vcmpneq_, vcmpcsq_, vcmpeqq_, vcmpgeq_, vcmpgtq_, vcmphiq_, vcmpleq_,
vcmpltq_])
 ;;
-(define_insn "mve_vcmpneq_"
+(define_insn "mve_vcmpq_"
   [
(set (match_operand:HI 0 "vpr_register_operand" "=Up")
-   (unspec:HI [(match_operand:MVE_2 1 "s_register_operand" "w")
-   (match_operand:MVE_2 2 "s_register_operand" "w")]
-VCMPNEQ))
+   (MVE_COMPARISONS:HI (match_operand:MVE_2 1 "s_register_operand" "w")
+   (match_operand:MVE_2 2 "s_register_operand" "w")))
+  ]
+  "TARGET_HAVE_MVE"
+  "vcmp.%#  , %q1, %q2"
+  [(set_attr "type" "mve_move")
+])

So should that use MVE_COMPARISONS:MVE_2, and somehow include a conversion from
MVE_2 modes into HI for vpr.p0 in the pattern?

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread rearnsha at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #9 from Richard Earnshaw  ---
(insn 7 4 8 2 (set (reg:HI 117)
(eq:HI (reg:V16QI 119)
(reg:V16QI 120))) {mve_vcmpeqq_v16qi}
 (expr_list:REG_DEAD (reg:V16QI 120)
(expr_list:REG_DEAD (reg:V16QI 119)
(nil))))

This is wrong.  EQ has a result of 0 or 1, regardless of the mode.
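
The effect on the returned predicate can be mimicked in plain C.  This is
only an illustrative sketch with hypothetical function names, not compiler
code: the first function models the zero_extend seen before r12-671, the
second the (and:SI ... (const_int 1)) form that combine generates once it
treats the HImode value as a 0/1 comparison result.

```c
#include <assert.h>
#include <stdint.h>

/* Models (zero_extend:SI (reg:HI 117)): all 16 predicate bits survive.  */
uint32_t pred_zero_extend(uint16_t p0)
{
    return (uint32_t)p0;
}

/* Models (and:SI (subreg:SI (reg:HI 117) 0) (const_int 1)):
   only bit 0 survives.  */
uint32_t pred_and_one(uint16_t p0)
{
    return (uint32_t)p0 & 1u;
}

/* The two forms agree only if the value really is 0 or 1.  */
void check_forms(void)
{
    assert(pred_zero_extend(1) == pred_and_one(1));
    assert(pred_zero_extend(0xffff) == 0xffffu);
    assert(pred_and_one(0xffff) == 1u);
}
```

For a comparison of sixteen equal byte lanes the full predicate is 0xffff,
but the masked form returns 1, which is the kind of mismatch the execution
test in this PR is meant to catch.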

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #8 from Christophe Lyon  ---
Indeed, it's what happens in try_combine():
i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);

converts i2src
(zero_extend:SI (reg:HI 117))
into:
(and:SI (subreg:SI (reg:HI 117) 0)
(const_int 1 [0x1]))

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #7 from Christophe Lyon  ---
Before the patch:
Trying 8 -> 14:
8: r113:SI=zero_extend(r117:HI)
  REG_DEAD r117:HI
   14: r0:SI=r113:SI
  REG_DEAD r113:SI
Successfully matched this instruction:
(set (reg/i:SI 0 r0)
(zero_extend:SI (reg:HI 117)))
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3    14: r0:SI=zero_extend(r117:HI)
  REG_DEAD r117:HI
deferring rescan insn with uid = 14.

After:
Trying 8 -> 14:
8: r113:SI=zero_extend(r117:HI)
  REG_DEAD r117:HI
   14: r0:SI=r113:SI
  REG_DEAD r113:SI
Successfully matched this instruction:
(set (reg/i:SI 0 r0)
(and:SI (subreg:SI (reg:HI 117) 0)
(const_int 1 [0x1])))
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3    14: r0:SI=r117:HI#0&0x1
  REG_DEAD r117:HI
deferring rescan insn with uid = 14.

So with the same inputs, combine makes a different decision; I guess that's
because it has some knowledge that reg:HI 117 is the result of a comparison
rather than an unspec.

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #6 from Christophe Lyon  ---
Before r12-671, before combine, we have:
(insn 7 4 8 2 (set (reg:HI 117)
(unspec:HI [
(reg/v:V16QI 115 [ v ])
(reg/v:V16QI 116 [ w ])
] VCMPEQQ_S)) "arm_mve.h":4210:10 3228 {mve_vcmpeqq_v16qi}
 (expr_list:REG_DEAD (reg/v:V16QI 116 [ w ])
(expr_list:REG_DEAD (reg/v:V16QI 115 [ v ])
(nil))))
(insn 8 7 14 2 (set (reg:SI 113 [ _5 ])
(zero_extend:SI (reg:HI 117))) "arm_mve.h":4210:10 1019
{*thumb2_zero_extendhisi2_v6}
 (expr_list:REG_DEAD (reg:HI 117)
(nil)))

After r12-671, we have:
(insn 7 4 8 2 (set (reg:HI 117)
(eq:HI (reg/v:V16QI 115 [ v ])
(reg/v:V16QI 116 [ w ]))) "arm_mve.h":4210:10 3173
{mve_vcmpeqq_v16qi}
 (expr_list:REG_DEAD (reg/v:V16QI 116 [ w ])
(expr_list:REG_DEAD (reg/v:V16QI 115 [ v ])
(nil))))
(insn 8 7 14 2 (set (reg:SI 113 [ _5 ])
(zero_extend:SI (reg:HI 117))) "arm_mve.h":4210:10 1019
{*thumb2_zero_extendhisi2_v6}
 (expr_list:REG_DEAD (reg:HI 117)
(nil)))

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #5 from Christophe Lyon  ---
Before the patch, combine says:
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3    14: r0:SI=zero_extend(r117:HI)
  REG_DEAD r117:HI
deferring rescan insn with uid = 14.
starting the processing of deferred insns
rescanning insn with uid = 7.
rescanning insn with uid = 14.
ending the processing of deferred insns

After the patch:
allowing combination of insns 8 and 14
original costs 4 + 2 = 6
replacement cost 4
deferring deletion of insn with uid = 8.
modifying insn i3    14: r0:SI=r117:HI#0&0x1
  REG_DEAD r117:HI
deferring rescan insn with uid = 14.
starting the processing of deferred insns
rescanning insn with uid = 7.
rescanning insn with uid = 14.
ending the processing of deferred insns

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-07 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2021-07-07
     Ever confirmed|0                           |1

--- Comment #4 from Christophe Lyon  ---
Before r12-671, in combine we have:
(insn 7 4 8 2 (set (reg:HI 117)
(unspec:HI [
(reg:V16QI 119)
(reg:V16QI 120)
] VCMPEQQ_S)) {mve_vcmpeqq_v16qi}
 (expr_list:REG_DEAD (reg:V16QI 120)
(expr_list:REG_DEAD (reg:V16QI 119)
(nil))))
(note 8 7 14 2 NOTE_INSN_DELETED)
(insn 14 8 15 2 (set (reg/i:SI 0 r0)
(zero_extend:SI (reg:HI 117))) "pr101325.c":7:1 1019
{*thumb2_zero_extendhisi2_v6}
 (expr_list:REG_DEAD (reg:HI 117)
(nil)))


After the patch:
(insn 7 4 8 2 (set (reg:HI 117)
(eq:HI (reg:V16QI 119)
(reg:V16QI 120))) {mve_vcmpeqq_v16qi}
 (expr_list:REG_DEAD (reg:V16QI 120)
(expr_list:REG_DEAD (reg:V16QI 119)
(nil))))
(note 8 7 14 2 NOTE_INSN_DELETED)
(insn 14 8 15 2 (set (reg/i:SI 0 r0)
(and:SI (subreg:SI (reg:HI 117) 0)
(const_int 1 [0x1]))) "pr101325.c":7:1 90 {*arm_andsi3_insn}
 (expr_list:REG_DEAD (reg:HI 117)
(nil)))

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-06 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Christophe Lyon  changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |clyon at gcc dot gnu.org

--- Comment #3 from Christophe Lyon  ---
I will have a look

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic since r12-671-gd083fbf72

2021-07-06 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Alex Coplan  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[12 Regression] arm: Wrong  |[12 Regression] arm: Wrong
                   |code with MVE vcmpeqq       |code with MVE vcmpeqq
                   |intrinsic                   |intrinsic since
                   |                            |r12-671-gd083fbf72
                 CC|                            |clyon at gcc dot gnu.org

--- Comment #2 from Alex Coplan  ---
Started with r12-671-gd083fbf72d4533d2009c725524983e1184981e74:

commit d083fbf72d4533d2009c725524983e1184981e74
Author: Christophe Lyon 
Date:   Mon May 10 13:52:02 2021

arm: MVE: Factorize all vcmp* integer patterns

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic

2021-07-06 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

Richard Biener  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0

[Bug target/101325] [12 Regression] arm: Wrong code with MVE vcmpeqq intrinsic

2021-07-05 Thread acoplan at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101325

--- Comment #1 from Alex Coplan  ---
Execution test for the testsuite:

#include <arm_mve.h>

__attribute((noinline,noipa))
unsigned foo(int8x16_t v, int8x16_t w)
{
  return vcmpeqq (v, w);
}

int main(void)
{
  /* All 16 byte lanes compare equal, so every predicate bit must be set.  */
  if (foo (vdupq_n_s8(0), vdupq_n_s8(0)) != 0xffffU)
    __builtin_abort ();
}