On December 12, 2019 4:10:33 PM GMT+01:00, Richard Sandiford 
<richard.sandif...@arm.com> wrote:
>One problem with adding an N-bit vector extension to an existing
>architecture is to decide how N-bit vectors should be passed to
>functions and returned from functions.  Allowing all N-bit vector
>types to be passed in registers breaks backwards compatibility,
>since N-bit vectors could be used (and emulated) before the vector
>extension was added.  But always passing N-bit vectors on the
>stack would be inefficient for things like vector libm functions.
>
>For SVE we took the compromise position of predefining new SVE vector
>types that are distinct from all existing vector types, including
>GNU-style vectors.  The new types are passed and returned in an
>efficient way while existing vector types are passed and returned
>in the traditional way.  In the right circumstances, the two types
>are inter-convertible.
>
>The SVE types are created using:
>
>      vectype = build_distinct_type_copy (vectype);
>      SET_TYPE_STRUCTURAL_EQUALITY (vectype);
>      TYPE_ARTIFICIAL (vectype) = 1;
>
>The C frontend maintains this distinction, using VIEW_CONVERT_EXPR
>to convert from one type to the other.  However, the distinction can
>be lost during gimple, which treats two vector types with the same
>mode, number of elements, and element type as equivalent.  And for
>most targets that's the right thing to do.

And why's that a problem? The difference appears only in the function call ABI 
which is determined by the function signature rather than types or modes of the 
actual arguments? 

Richard. 

>This patch therefore adds a hook that lets the target choose
>whether such vector types are indeed equivalent.
>
>Note that the new tests fail for -mabi=ilp32 in the same way as other
>ACLE-based tests.  I'm still planning to fix that as a follow-on.
>
>Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
>Richard
>
>
>2019-12-12  Richard Sandiford  <richard.sandif...@arm.com>
>
>gcc/
>       * target.def (compatible_vector_types_p): New target hook.
>       * hooks.h (hook_bool_const_tree_const_tree_true): Declare.
>       * hooks.c (hook_bool_const_tree_const_tree_true): New function.
>       * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
>       * doc/tm.texi: Regenerate.
>       * gimple-expr.c: Include target.h.
>       (useless_type_conversion_p): Use targetm.compatible_vector_types_p.
>       * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
>       function.
>       (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
>       * config/aarch64/aarch64-sve-builtins.cc
>(gimple_folder::convert_pred):
>       Use the original predicate if it already has a suitable type.
>
>gcc/testsuite/
>       * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
>       * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
>
>Index: gcc/target.def
>===================================================================
>--- gcc/target.def     2019-11-30 18:48:18.531984101 +0000
>+++ gcc/target.def     2019-12-12 15:07:43.960415368 +0000
>@@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
>  hook_bool_mode_false)
> 
> DEFHOOK
>+(compatible_vector_types_p,
>+ "Return true if there is no target-specific reason for treating\n\
>+vector types @var{type1} and @var{type2} as distinct types.  The
>caller\n\
>+has already checked for target-independent reasons, meaning that
>the\n\
>+types are known to have the same mode, to have the same number of
>elements,\n\
>+and to have what the caller considers to be compatible element
>types.\n\
>+\n\
>+The main reason for defining this hook is to reject pairs of types\n\
>+that are handled differently by the target's calling convention.\n\
>+For example, when a new @var{N}-bit vector architecture is added\n\
>+to a target, the target may want to handle normal @var{N}-bit\n\
>+@code{VECTOR_TYPE} arguments and return values in the same way as\n\
>+before, to maintain backwards compatibility.  However, it may also\n\
>+provide new, architecture-specific @code{VECTOR_TYPE}s that are
>passed\n\
>+and returned in a more efficient way.  It is then important to
>maintain\n\
>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the
>new\n\
>+architecture-specific ones.\n\
>+\n\
>+The default implementation returns true, which is correct for most
>targets.",
>+ bool, (const_tree type1, const_tree type2),
>+ hook_bool_const_tree_const_tree_true)
>+
>+DEFHOOK
> (vector_alignment,
> "This hook can be used to define the alignment for a vector of type\n\
>@var{type}, in order to comply with a platform ABI.  The default is
>to\n\
>Index: gcc/hooks.h
>===================================================================
>--- gcc/hooks.h        2019-11-04 21:13:57.727755548 +0000
>+++ gcc/hooks.h        2019-12-12 15:07:43.960415368 +0000
>@@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
> extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
> extern bool hook_bool_tree_false (tree);
> extern bool hook_bool_const_tree_false (const_tree);
>+extern bool hook_bool_const_tree_const_tree_true (const_tree,
>const_tree);
> extern bool hook_bool_tree_true (tree);
> extern bool hook_bool_const_tree_true (const_tree);
> extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
>Index: gcc/hooks.c
>===================================================================
>--- gcc/hooks.c        2019-11-04 21:13:57.727755548 +0000
>+++ gcc/hooks.c        2019-12-12 15:07:43.960415368 +0000
>@@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
> }
> 
> bool
>+hook_bool_const_tree_const_tree_true (const_tree, const_tree)
>+{
>+  return true;
>+}
>+
>+bool
> hook_bool_tree_true (tree)
> {
>   return true;
>Index: gcc/doc/tm.texi.in
>===================================================================
>--- gcc/doc/tm.texi.in 2019-11-30 18:48:18.523984157 +0000
>+++ gcc/doc/tm.texi.in 2019-12-12 15:07:43.956415393 +0000
>@@ -3365,6 +3365,8 @@ stack.
> 
> @hook TARGET_VECTOR_MODE_SUPPORTED_P
> 
>+@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
>+
> @hook TARGET_ARRAY_MODE
> 
> @hook TARGET_ARRAY_MODE_SUPPORTED_P
>Index: gcc/doc/tm.texi
>===================================================================
>--- gcc/doc/tm.texi    2019-11-30 18:48:18.507984271 +0000
>+++ gcc/doc/tm.texi    2019-12-12 15:07:43.952415419 +0000
>@@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
> must have move patterns for this mode.
> @end deftypefn
> 
>+@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P
>(const_tree @var{type1}, const_tree @var{type2})
>+Return true if there is no target-specific reason for treating
>+vector types @var{type1} and @var{type2} as distinct types.  The
>caller
>+has already checked for target-independent reasons, meaning that the
>+types are known to have the same mode, to have the same number of
>elements,
>+and to have what the caller considers to be compatible element types.
>+
>+The main reason for defining this hook is to reject pairs of types
>+that are handled differently by the target's calling convention.
>+For example, when a new @var{N}-bit vector architecture is added
>+to a target, the target may want to handle normal @var{N}-bit
>+@code{VECTOR_TYPE} arguments and return values in the same way as
>+before, to maintain backwards compatibility.  However, it may also
>+provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
>+and returned in a more efficient way.  It is then important to
>maintain
>+a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
>+architecture-specific ones.
>+
>+The default implementation returns true, which is correct for most
>targets.
>+@end deftypefn
>+
>@deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE
>(machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
> Return the mode that GCC should use for an array that has
> @var{nelems} elements, with each element having mode @var{mode}.
>Index: gcc/gimple-expr.c
>===================================================================
>--- gcc/gimple-expr.c  2019-10-08 09:23:31.902529513 +0100
>+++ gcc/gimple-expr.c  2019-12-12 15:07:43.956415393 +0000
>@@ -37,6 +37,7 @@ Software Foundation; either version 3, o
> #include "tree-pass.h"
> #include "stringpool.h"
> #include "attribs.h"
>+#include "target.h"
> 
> /* ----- Type related -----  */
> 
>@@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
> 
>   /* Recurse for vector types with the same number of subparts.  */
>   else if (TREE_CODE (inner_type) == VECTOR_TYPE
>-         && TREE_CODE (outer_type) == VECTOR_TYPE
>-         && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
>-    return useless_type_conversion_p (TREE_TYPE (outer_type),
>-                                    TREE_TYPE (inner_type));
>+         && TREE_CODE (outer_type) == VECTOR_TYPE)
>+    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
>+                    TYPE_VECTOR_SUBPARTS (outer_type))
>+          && useless_type_conversion_p (TREE_TYPE (outer_type),
>+                                        TREE_TYPE (inner_type))
>+          && targetm.compatible_vector_types_p (inner_type, outer_type));
> 
>   else if (TREE_CODE (inner_type) == ARRAY_TYPE
>          && TREE_CODE (outer_type) == ARRAY_TYPE)
>Index: gcc/config/aarch64/aarch64.c
>===================================================================
>--- gcc/config/aarch64/aarch64.c       2019-12-10 16:45:56.338226712 +0000
>+++ gcc/config/aarch64/aarch64.c       2019-12-12 15:07:43.940415503 +0000
>@@ -2120,6 +2120,20 @@ aarch64_fntype_abi (const_tree fntype)
>   return default_function_abi;
> }
> 
>+/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
>+
>+static bool
>+aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
>+{
>+  unsigned int num_zr1 = 0, num_pr1 = 0, num_zr2 = 0, num_pr2 = 0;
>+  if (aarch64_sve_argument_p (type1, &num_zr1, &num_pr1)
>+      != aarch64_sve_argument_p (type2, &num_zr2, &num_pr2))
>+    return false;
>+
>+  gcc_assert (num_zr1 == num_zr2 && num_pr1 == num_pr2);
>+  return true;
>+}
>+
> /* Return true if we should emit CFI for register REGNO.  */
> 
> static bool
>@@ -22031,6 +22045,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
> #undef TARGET_VECTOR_MODE_SUPPORTED_P
> #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
> 
>+#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
>+#define TARGET_COMPATIBLE_VECTOR_TYPES_P
>aarch64_compatible_vector_types_p
>+
> #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
> #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
>   aarch64_builtin_support_vector_misalignment
>Index: gcc/config/aarch64/aarch64-sve-builtins.cc
>===================================================================
>--- gcc/config/aarch64/aarch64-sve-builtins.cc 2019-12-06
>18:22:12.072859530 +0000
>+++ gcc/config/aarch64/aarch64-sve-builtins.cc 2019-12-12
>15:07:43.936415528 +0000
>@@ -2251,9 +2251,13 @@ tree
> gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
>                            unsigned int argno)
> {
>-  tree predtype = truth_type_for (vectype);
>   tree pred = gimple_call_arg (call, argno);
>-  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
>+  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
>+              TYPE_VECTOR_SUBPARTS (vectype)))
>+    return pred;
>+
>+  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
>+                     truth_type_for (vectype), pred);
> }
> 
> /* Return a pointer to the address in a contiguous load or store,
>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
>===================================================================
>--- /dev/null  2019-09-17 11:41:18.176664108 +0100
>+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c   2019-12-12
>15:07:43.972415287 +0000
>@@ -0,0 +1,99 @@
>+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
>+
>+#include <arm_sve.h>
>+
>+typedef float16_t float16x16_t __attribute__((vector_size (32)));
>+typedef float32_t float32x8_t __attribute__((vector_size (32)));
>+typedef float64_t float64x4_t __attribute__((vector_size (32)));
>+typedef int8_t int8x32_t __attribute__((vector_size (32)));
>+typedef int16_t int16x16_t __attribute__((vector_size (32)));
>+typedef int32_t int32x8_t __attribute__((vector_size (32)));
>+typedef int64_t int64x4_t __attribute__((vector_size (32)));
>+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
>+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
>+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
>+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
>+
>+void float16_callee (float16x16_t);
>+void float32_callee (float32x8_t);
>+void float64_callee (float64x4_t);
>+void int8_callee (int8x32_t);
>+void int16_callee (int16x16_t);
>+void int32_callee (int32x8_t);
>+void int64_callee (int64x4_t);
>+void uint8_callee (uint8x32_t);
>+void uint16_callee (uint16x16_t);
>+void uint32_callee (uint32x8_t);
>+void uint64_callee (uint64x4_t);
>+
>+void
>+float16_caller (void)
>+{
>+  float16_callee (svdup_f16 (1.0));
>+}
>+
>+void
>+float32_caller (void)
>+{
>+  float32_callee (svdup_f32 (2.0));
>+}
>+
>+void
>+float64_caller (void)
>+{
>+  float64_callee (svdup_f64 (3.0));
>+}
>+
>+void
>+int8_caller (void)
>+{
>+  int8_callee (svindex_s8 (0, 1));
>+}
>+
>+void
>+int16_caller (void)
>+{
>+  int16_callee (svindex_s16 (0, 2));
>+}
>+
>+void
>+int32_caller (void)
>+{
>+  int32_callee (svindex_s32 (0, 3));
>+}
>+
>+void
>+int64_caller (void)
>+{
>+  int64_callee (svindex_s64 (0, 4));
>+}
>+
>+void
>+uint8_caller (void)
>+{
>+  uint8_callee (svindex_u8 (1, 1));
>+}
>+
>+void
>+uint16_caller (void)
>+{
>+  uint16_callee (svindex_u16 (1, 2));
>+}
>+
>+void
>+uint32_caller (void)
>+{
>+  uint32_callee (svindex_u32 (1, 3));
>+}
>+
>+void
>+uint64_caller (void)
>+{
>+  uint64_callee (svindex_u64 (1, 4));
>+}
>+
>+/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7],
>\[x0\]} 2 } } */
>+/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7],
>\[x0\]} 3 } } */
>+/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7],
>\[x0\]} 3 } } */
>+/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7],
>\[x0\]} 3 } } */
>+/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
>Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
>===================================================================
>--- /dev/null  2019-09-17 11:41:18.176664108 +0100
>+++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c   2019-12-12
>15:07:43.972415287 +0000
>@@ -0,0 +1,99 @@
>+/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
>+
>+#include <arm_sve.h>
>+
>+typedef float16_t float16x16_t __attribute__((vector_size (32)));
>+typedef float32_t float32x8_t __attribute__((vector_size (32)));
>+typedef float64_t float64x4_t __attribute__((vector_size (32)));
>+typedef int8_t int8x32_t __attribute__((vector_size (32)));
>+typedef int16_t int16x16_t __attribute__((vector_size (32)));
>+typedef int32_t int32x8_t __attribute__((vector_size (32)));
>+typedef int64_t int64x4_t __attribute__((vector_size (32)));
>+typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
>+typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
>+typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
>+typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
>+
>+void float16_callee (svfloat16_t);
>+void float32_callee (svfloat32_t);
>+void float64_callee (svfloat64_t);
>+void int8_callee (svint8_t);
>+void int16_callee (svint16_t);
>+void int32_callee (svint32_t);
>+void int64_callee (svint64_t);
>+void uint8_callee (svuint8_t);
>+void uint16_callee (svuint16_t);
>+void uint32_callee (svuint32_t);
>+void uint64_callee (svuint64_t);
>+
>+void
>+float16_caller (float16x16_t arg)
>+{
>+  float16_callee (arg);
>+}
>+
>+void
>+float32_caller (float32x8_t arg)
>+{
>+  float32_callee (arg);
>+}
>+
>+void
>+float64_caller (float64x4_t arg)
>+{
>+  float64_callee (arg);
>+}
>+
>+void
>+int8_caller (int8x32_t arg)
>+{
>+  int8_callee (arg);
>+}
>+
>+void
>+int16_caller (int16x16_t arg)
>+{
>+  int16_callee (arg);
>+}
>+
>+void
>+int32_caller (int32x8_t arg)
>+{
>+  int32_callee (arg);
>+}
>+
>+void
>+int64_caller (int64x4_t arg)
>+{
>+  int64_callee (arg);
>+}
>+
>+void
>+uint8_caller (uint8x32_t arg)
>+{
>+  uint8_callee (arg);
>+}
>+
>+void
>+uint16_caller (uint16x16_t arg)
>+{
>+  uint16_callee (arg);
>+}
>+
>+void
>+uint32_caller (uint32x8_t arg)
>+{
>+  uint32_callee (arg);
>+}
>+
>+void
>+uint64_caller (uint64x4_t arg)
>+{
>+  uint64_callee (arg);
>+}
>+
>+/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]}
>2 } } */
>+/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]}
>3 } } */
>+/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]}
>3 } } */
>+/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]}
>3 } } */
>+/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */

Reply via email to