Re: Optimise constant IFN_WHILE_ULTs

2019-08-13 Thread Jeff Law
On 8/13/19 4:54 AM, Richard Sandiford wrote:
> This patch is a combination of two changes that have to be committed as
> a single unit, one target-independent and one target-specific:
> 
> [description and ChangeLog trimmed; quoted in full in the original
> message below]
> 
> OK for the generic bits (= the first three files
> in the diff)?
> 
Generic bits are fine.

jeff


Optimise constant IFN_WHILE_ULTs

2019-08-13 Thread Richard Sandiford
This patch is a combination of two changes that have to be committed as
a single unit, one target-independent and one target-specific:

(1) Try to fold IFN_WHILE_ULTs with constant arguments to a VECTOR_CST
(which is always possible for fixed-length vectors but is not
necessarily so for variable-length vectors)

(2) Make the SVE port recognise constants that map to PTRUE VLn,
which includes those generated by the new fold.

(2) can't be tested without (1) and (1) would be a significant
pessimisation without (2).
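For concreteness, the fold in (1) can be modelled standalone (this is an illustrative sketch, not the patch's code; the names and the fixed NELTS are assumptions): lane I of WHILE_ULT (A, B) is active exactly when A + I < B (unsigned), so with constant operands the whole predicate is a constant of N leading true lanes followed by false lanes.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model of folding IFN_WHILE_ULT (A, B) with constant
   operands: lane I of the predicate is true iff A + I < B (unsigned),
   i.e. the first min (B - A, NELTS) lanes are true for A <= B and the
   rest are false.  For fixed-length vectors NELTS is a compile-time
   constant, so the result folds to a constant vector.  */
static void
while_ult_fold (unsigned int a, unsigned int b,
                unsigned int nelts, bool *pred)
{
  for (unsigned int i = 0; i < nelts; ++i)
    pred[i] = a + i < b;
}
```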

The target-specific parts also start moving towards doing predicate
manipulation in a canonical VNx16BImode form, using rtx_vector_builders.
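As background for the VNx16BImode canonicalisation (again a hedged sketch with illustrative names, not the patch's code): an SVE predicate has one bit per byte of vector data, so a predicate governing wider elements only uses the first bit of each element-sized group, with the padding bits zero. A byte-granular model of that layout:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of the byte-granular (VNx16BImode-style) predicate layout:
   one predicate bit per byte of data.  A predicate for ELT_BYTES-wide
   elements sets only the first bit of each ELT_BYTES-byte group and
   leaves the remaining (padding) bits clear.  */
static void
canonicalize_pred (const bool *lanes, unsigned int nlanes,
                   unsigned int elt_bytes, bool *bytes)
{
  for (unsigned int i = 0; i < nlanes * elt_bytes; ++i)
    bytes[i] = i % elt_bytes == 0 && lanes[i / elt_bytes];
}
```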

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and
x86_64-linux-gnu.  OK for the generic bits (= the first three files
in the diff)?

Thanks,
Richard


2019-08-13  Richard Sandiford  

gcc/
* tree.h (build_vector_a_then_b): Declare.
* tree.c (build_vector_a_then_b): New function.
* fold-const-call.c (fold_while_ult): Likewise.
(fold_const_call): Use it to handle IFN_WHILE_ULT.
* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPATTERN): New macro.
(aarch64_svpattern): New enum.
* config/aarch64/aarch64-sve.md (mov): Pass
constants through aarch64_expand_mov_immediate.
(*aarch64_sve_mov): Use aarch64_mov_operand rather
than general_operand as the predicate for operand 1.
(while_ult): Add a '@' marker.
* config/aarch64/aarch64.c (simd_immediate_info::PTRUE): New
insn_type.
(simd_immediate_info::simd_immediate_info): New overload that
takes a scalar_int_mode and an svpattern.
(simd_immediate_info::u): Add a "pattern" field.
(svpattern_token): New function.
(aarch64_get_sve_pred_bits, aarch64_widest_sve_pred_elt_size)
(aarch64_partial_ptrue_length, aarch64_svpattern_for_vl)
(aarch64_sve_move_pred_via_while): New functions.
(aarch64_expand_mov_immediate): Try using
aarch64_sve_move_pred_via_while for predicates that contain N ones
followed by M zeros but that do not correspond to a VLnnn pattern.
(aarch64_sve_pred_valid_immediate): New function.
(aarch64_simd_valid_immediate): Use it instead of dealing directly
with PTRUE and PFALSE.
(aarch64_output_sve_mov_immediate): Handle new simd_immediate_info
forms.

gcc/testsuite/
* gcc.target/aarch64/sve/spill_2.c: Increase iteration counts
beyond the range of a PTRUE.
* gcc.target/aarch64/sve/while_6.c: New test.
* gcc.target/aarch64/sve/while_7.c: Likewise.
* gcc.target/aarch64/sve/while_8.c: Likewise.
* gcc.target/aarch64/sve/while_9.c: Likewise.
* gcc.target/aarch64/sve/while_10.c: Likewise.

Index: gcc/tree.h
===================================================================
--- gcc/tree.h  2019-08-13 10:38:29.003944679 +0100
+++ gcc/tree.h  2019-08-13 11:46:02.110720978 +0100
@@ -4314,6 +4314,7 @@ extern tree build_vector_from_val (tree,
 extern tree build_uniform_cst (tree, tree);
 extern tree build_vec_series (tree, tree, tree);
 extern tree build_index_vector (tree, poly_uint64, poly_uint64);
+extern tree build_vector_a_then_b (tree, unsigned int, tree, tree);
 extern void recompute_constructor_flags (tree);
 extern void verify_constructor_flags (tree);
 extern tree build_constructor (tree, vec<constructor_elt, va_gc> *
			       CXX_MEM_STAT_INFO);
Index: gcc/tree.c
===================================================================
--- gcc/tree.c  2019-08-13 10:38:29.003944679 +0100
+++ gcc/tree.c  2019-08-13 11:46:02.110720978 +0100
@@ -1981,6 +1981,23 @@ build_index_vector (tree vec_type, poly_
   return v.build ();
 }
 
+/* Return a VECTOR_CST of type VEC_TYPE in which the first NUM_A
+   elements are A and the rest are B.  */
+
+tree
+build_vector_a_then_b (tree vec_type, unsigned int num_a, tree a, tree b)
+{
+  gcc_assert (known_le (num_a, TYPE_VECTOR_SUBPARTS (vec_type)));
+  unsigned int count = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vec_type));
+  /* Optimize the constant case.  */
+  if ((count & 1) == 0 && TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
+    count /= 2;
+  tree_vector_builder builder (vec_type, count, 2);
+  for (unsigned int i = 0; i < count * 2; ++i)
+    builder.quick_push (i < num_a ? a : b);
+  return builder.build ();
+}
+
 /* Something has messed with the elements of CONSTRUCTOR C after it was built;
calculate TREE_CONSTANT and TREE_SIDE_EFFECTS.  */
 
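An aside on the encoding the new build_vector_a_then_b relies on (a standalone sketch under assumed names, not GCC's vector_builder API): the builder stores COUNT patterns of two elements each, and every element beyond the explicit 2*COUNT repeats the second element of its pattern. That repetition is why halving an even constant COUNT above still reconstructs the same vector.

```c
#include <assert.h>

/* Fill ENCODED[0..2*COUNT) the way build_vector_a_then_b's loop does:
   the first NUM_A explicit elements are A, the rest are B.  */
static void
encode_a_then_b (unsigned int count, unsigned int num_a,
                 int a, int b, int *encoded)
{
  for (unsigned int i = 0; i < count * 2; ++i)
    encoded[i] = i < num_a ? a : b;
}

/* Decode element J of the full (possibly longer) vector: with COUNT
   patterns of two elements each, elements past the explicit prefix
   repeat the second element of their pattern.  */
static int
decode_elt (const int *encoded, unsigned int count, unsigned int j)
{
  if (j < 2 * count)
    return encoded[j];
  return encoded[count + j % count];
}
```

For example, with four patterns and num_a = 3 the explicit prefix is {a, a, a, b, b, b, b, b}, and every later element decodes to b, as required for an "a then b" vector.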
Index: gcc/fold-const-call.c
===================================================================
--- gcc/fold-const-call.c   2019-03-08 18:15:31.292760907 +
+++ gcc/fold-const-call.c   2019-08-13 11:46:02.106721009 +0100
@@ -691,6 +691,36 @@ fold_const_vec_convert (tree ret_type, t
 
 /* Try to evaluate:
 
+  IFN_WHILE_ULT (ARG0, ARG1, (TYPE) { ... })
+
+   Return the value on success and null on failure.  */
+
+static tree