On Tue, Jan 7, 2020 at 11:33 AM Richard Sandiford
<richard.sandif...@arm.com> wrote:
>
> Richard Sandiford <richard.sandif...@arm.com> writes:
> > Richard Biener <richard.guent...@gmail.com> writes:
> >> On December 14, 2019 11:43:48 AM GMT+01:00, Richard Sandiford 
> >> <richard.sandif...@arm.com> wrote:
> >>>Richard Biener <richard.guent...@gmail.com> writes:
> >>>> On December 13, 2019 10:12:40 AM GMT+01:00, Richard Sandiford
> >>>> <richard.sandif...@arm.com> wrote:
> >>>>>Richard Biener <richard.guent...@gmail.com> writes:
> >>>>>>>>>The AArch64 port emits an error if calls pass values of SVE type
> >>>>>>>>>to an unprototyped function.  To do that we need to know whether
> >>>>>>>>>the value really is an SVE type rather than a plain vector.
> >>>>>>>>>
> >>>>>>>>>For varargs the ABI is the same for 256 bits and above.  But we'll
> >>>>>>>>>have the same problem there once we support -msve-vector-bits=128,
> >>>>>>>>>since the layout of SVE and Advanced SIMD vectors differs for
> >>>>>>>>>big-endian.
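> >>>>>>>>>
> >>>>>>>>>As a minimal sketch (illustrative, not taken from the patch), the
> >>>>>>>>>kind of call the port rejects looks like:
> >>>>>>>>>
> >>>>>>>>>  #include <arm_sve.h>
> >>>>>>>>>
> >>>>>>>>>  void bar ();  /* unprototyped */
> >>>>>>>>>
> >>>>>>>>>  void
> >>>>>>>>>  foo (svint32_t x)
> >>>>>>>>>  {
> >>>>>>>>>    bar (x);  /* AArch64 emits an error for this call */
> >>>>>>>>>  }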
> >>>>>>>>
> >>>>>>>> But then why don't you have different modes?
> >>>>>>>
> >>>>>>>Yeah, true, modes will probably help for the Advanced SIMD/SVE
> >>>>>>>difference.  But from a vector value POV, a vector of 4 ints is a
> >>>>>>>vector of 4 ints, so even distinguishing based on the mode is
> >>>>>>>artificial.
> >>>>>>
> >>>>>> True.
> >>>>>>
> >>>>>>>SVE is AFAIK the first target to have different modes for
> >>>>>>>potentially the "same" vector type, and I had to add new
> >>>>>>>infrastructure to allow targets to define multiple modes of the
> >>>>>>>same size.  So the fact that gimple distinguishes otherwise
> >>>>>>>identical vectors based on mode is a relatively recent thing.
> >>>>>>>AFAIK it just fell out in the wash rather than being deliberately
> >>>>>>>planned.  It happens to be convenient in this context, but it
> >>>>>>>hasn't been important until now.
> >>>>>>>
> >>>>>>>The hook doesn't seem any worse than distinguishing based on the
> >>>>>>>mode.  Another way to avoid this would have been to define separate
> >>>>>>>SVE modes for the predefined vectors.  The big downside of that is
> >>>>>>>that we'd end up doubling the number of SVE patterns.
> >>>>>>>
> >>>>>>>Extra on-the-side metadata is going to be easy to drop accidentally,
> >>>>>>>and this is something we need for correctness rather than
> >>>>>>>optimisation.
> >>>>>>
> >>>>>> Still, selecting the ABI only during call expansion, based on the
> >>>>>> value types at that point, is fragile.
> >>>>>
> >>>>>Agreed.  But it's fragile in general, not just for this case.
> >>>>>Changing something as fundamental as that would be a lot of work
> >>>>>and seems likely to introduce accidental ABI breakage.
> >>>>>
> >>>>>> The frontends are in charge of specifying the actual argument types,
> >>>>>> and at that point the target may fix the ABI.  The ABI can be
> >>>>>> recorded in the call's fntype, either via its TYPE_ARG_TYPES or in
> >>>>>> more awkward ways for varargs functions (in full generality that
> >>>>>> would mean attaching varargs ABI metadata to each call).
> >>>>>>
> >>>>>> The alternative is to have an actual argument type vector associated
> >>>>>> with each call.
> >>>>>
> >>>>>I think multiple pieces of gimple code would then have to cope with
> >>>>>that as a special case.  E.g. if:
> >>>>>
> >>>>>   void foo (int, ...);
> >>>>>
> >>>>>   type1 a;
> >>>>>   b = VIEW_CONVERT_EXPR<type2> (a);
> >>>>>   if (a)
> >>>>>     foo (1, a);
> >>>>>   else
> >>>>>     foo (1, b);
> >>>>>
> >>>>>gets converted to:
> >>>>>
> >>>>>   if (a)
> >>>>>     foo (1, a);
> >>>>>   else
> >>>>>     foo (1, a);
> >>>>>
> >>>>>on the basis that type1 and type2 are "the same" despite having
> >>>>>different calling conventions, we have to be sure that the calls
> >>>>>are not treated as equivalent:
> >>>>>
> >>>>>   foo (1, a);
> >>>>>
> >>>>>Things like IPA clones would also need to handle this specially.
> >>>>>Anything that generates new calls based on old ones will need
> >>>>>to copy this information too.
> >>>>>
> >>>>>This also sounds like it would be fragile and seems a bit too
> >>>>>invasive for stage 3.
> >>>>
> >>>> But we are already relying on this to work (fntype non-propagation),
> >>>> because function pointer conversions are dropped on the floor.
> >>>>
> >>>> The real change would be introducing a (per-call) fntype for calls to
> >>>> unprototyped functions and somehow dealing with varargs.
> >>>
> >>>It looks like this itself relies on useless_type_conversion_p,
> >>>is that right?  E.g. we have things like:
> >>>
> >>>bool
> >>>func_checker::compare_gimple_call (gcall *s1, gcall *s2)
> >>>{
> >>>  ...
> >>>  tree fntype1 = gimple_call_fntype (s1);
> >>>  tree fntype2 = gimple_call_fntype (s2);
> >>>  if ((fntype1 && !fntype2)
> >>>      || (!fntype1 && fntype2)
> >>>      || (fntype1 && !types_compatible_p (fntype1, fntype2)))
> >>>    return return_false_with_msg ("call function types are not compatible");
> >>>
> >>>and useless_type_conversion_p has:
> >>>
> >>>  else if ((TREE_CODE (inner_type) == FUNCTION_TYPE
> >>>            || TREE_CODE (inner_type) == METHOD_TYPE)
> >>>           && TREE_CODE (inner_type) == TREE_CODE (outer_type))
> >>>    {
> >>>      tree outer_parm, inner_parm;
> >>>
> >>>      /* If the return types are not compatible bail out.  */
> >>>      if (!useless_type_conversion_p (TREE_TYPE (outer_type),
> >>>                                      TREE_TYPE (inner_type)))
> >>>        return false;
> >>>
> >>>      /* Method types should belong to a compatible base class.  */
> >>>      if (TREE_CODE (inner_type) == METHOD_TYPE
> >>>          && !useless_type_conversion_p (TYPE_METHOD_BASETYPE (outer_type),
> >>>                                         TYPE_METHOD_BASETYPE (inner_type)))
> >>>        return false;
> >>>
> >>>      /* A conversion to an unprototyped argument list is ok.  */
> >>>      if (!prototype_p (outer_type))
> >>>        return true;
> >>>
> >>>      /* If the unqualified argument types are compatible the conversion
> >>>         is useless.  */
> >>>      if (TYPE_ARG_TYPES (outer_type) == TYPE_ARG_TYPES (inner_type))
> >>>        return true;
> >>>
> >>>      for (outer_parm = TYPE_ARG_TYPES (outer_type),
> >>>           inner_parm = TYPE_ARG_TYPES (inner_type);
> >>>           outer_parm && inner_parm;
> >>>           outer_parm = TREE_CHAIN (outer_parm),
> >>>           inner_parm = TREE_CHAIN (inner_parm))
> >>>        if (!useless_type_conversion_p
> >>>               (TYPE_MAIN_VARIANT (TREE_VALUE (outer_parm)),
> >>>                TYPE_MAIN_VARIANT (TREE_VALUE (inner_parm))))
> >>>          return false;
> >>>
> >>>So it looks like we'd still need to distinguish the vector types in
> >>>useless_type_conversion_p even if we went the fntype route.  The
> >>>difference is that the fntype route would give us the option of only
> >>>distinguishing the vectors for return and argument types and not in
> >>>general.
> >>>
> >>>But if we are going to have to distinguish the vectors here anyway
> >>>in some form, could we go with the patch as-is for stage 3 and leave
> >>>restricting this to just return and argument types as a follow-on
> >>>optimisation?
> >>
> >> How does this get around the LTO canonical type merging machinery?
> >> That is, how are those types streamed and how are they identified by
> >> the backend?  Just by means of being pointer equal to some statically
> >> built type in the backend?  Or does the type have some attribute on it
> >> or on the component?  How does the middle end build a related type with
> >> the same ABI, like a vector with half the number of elements?
> >
> > Hmm...
> >
> > At the moment it's based on pointer equality between the TYPE_MAIN_VARIANT
> > and statically-built types.  We predefine the only available SVE "ABI types"
> > and there's no way to create "new" ones.
> >
> > But you're right that that doesn't work for LTO -- in general, not just
> > for this conversion patch -- because no streamed types end up as ABI types.
> > So we'll need an attribute after all, with the ABI decisions keyed off that
> > rather than TYPE_MAIN_VARIANT pointer equality.  Will fix...
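> >
> > As a minimal sketch, the attribute-based test might look something like
> > this (the attribute name and helper are illustrative, not the committed
> > code):
> >
> >   /* Return true if TYPE is marked as an SVE ABI type.  */
> >   static bool
> >   sve_abi_type_p (tree type)
> >   {
> >     return lookup_attribute ("SVE type",
> >                              TYPE_ATTRIBUTES (type)) != NULL_TREE;
> >   }
> >
> > The point is that the check no longer depends on pointer equality, so
> > types recreated by LTO streaming are still identified correctly.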
>
> Now fixed :-)
>
> > Once that's fixed, the fact that we use SET_TYPE_STRUCTURAL_EQUALITY
> > for the ABI types means that the types remain distinct from "normal"
> > vector types even for TYPE_CANONICAL purposes, since:
> >
> >      As a special case, if TYPE_CANONICAL is NULL_TREE, and thus
> >      TYPE_STRUCTURAL_EQUALITY_P is true, then it cannot
> >      be used for comparison against other types.  Instead, the type is
> >      said to require structural equality checks, described in
> >      TYPE_STRUCTURAL_EQUALITY_P.
> >      [...]
> >   #define TYPE_CANONICAL(NODE) (TYPE_CHECK (NODE)->type_common.canonical)
> >   /* Indicates that the type node requires structural equality
> >      checks.  The compiler will need to look at the composition of the
> >      type to determine whether it is equal to another type, rather than
> >      just comparing canonical type pointers.  For instance, we would need
> >      to look at the return and parameter types of a FUNCTION_TYPE
> >      node.  */
> >   #define TYPE_STRUCTURAL_EQUALITY_P(NODE) (TYPE_CANONICAL (NODE) == NULL_TREE)
> >
> > We also have:
> >
> > /* Return true if get_alias_set cares about TYPE_CANONICAL of given type.
> >    We don't define the types for pointers, arrays and vectors.  The reason
> >    is that pointers are handled specially: ptr_type_node accesses conflict
> >    with accesses to all other pointers.  This is done by alias.c.
> >    Because alias sets of arrays and vectors are the same as types of their
> >    elements, we can't compute canonical type either.  Otherwise we could go
> >    from void *[10] to int *[10] (because they are equivalent for canonical
> >    type machinery) and get wrong TBAA.  */
> >
> > inline bool
> > canonical_type_used_p (const_tree t)
> > {
> >   return !(POINTER_TYPE_P (t)
> >            || TREE_CODE (t) == ARRAY_TYPE
> >            || TREE_CODE (t) == VECTOR_TYPE);
> > }
> >
> > So with the attribute added (needed anyway), the patch does seem to
> > work for LTO too.
>
> Given the above, is the patch OK?  I agree it isn't very elegant,
> but at the moment we have no choice but to distinguish the vector
> types at some point during gimple.
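>
> To make the problem concrete (an illustrative sketch, assuming
> -msve-vector-bits=256): the two parameter types below have the same mode
> and element type, but different calling conventions, so without the hook
> useless_type_conversion_p would let one substitute for the other:
>
>   #include <arm_sve.h>
>
>   typedef int32_t int32x8_t __attribute__((vector_size (32)));
>
>   void take_sve (svint32_t x);  /* passed in an SVE register */
>   void take_gnu (int32x8_t x);  /* passed by reference, as before */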

OK.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> 2020-01-07  Richard Sandiford  <richard.sandif...@arm.com>
>
> gcc/
>         * target.def (compatible_vector_types_p): New target hook.
>         * hooks.h (hook_bool_const_tree_const_tree_true): Declare.
>         * hooks.c (hook_bool_const_tree_const_tree_true): New function.
>         * doc/tm.texi.in (TARGET_COMPATIBLE_VECTOR_TYPES_P): New hook.
>         * doc/tm.texi: Regenerate.
>         * gimple-expr.c: Include target.h.
>         (useless_type_conversion_p): Use targetm.compatible_vector_types_p.
>         * config/aarch64/aarch64.c (aarch64_compatible_vector_types_p): New
>         function.
>         (TARGET_COMPATIBLE_VECTOR_TYPES_P): Define.
>         * config/aarch64/aarch64-sve-builtins.cc (gimple_folder::convert_pred):
>         Use the original predicate if it already has a suitable type.
>
> gcc/testsuite/
>         * gcc.target/aarch64/sve/pcs/gnu_vectors_1.c: New test.
>         * gcc.target/aarch64/sve/pcs/gnu_vectors_2.c: Likewise.
>
> Index: gcc/target.def
> ===================================================================
> --- gcc/target.def      2020-01-06 12:57:55.753930730 +0000
> +++ gcc/target.def      2020-01-07 10:24:01.546344751 +0000
> @@ -3411,6 +3411,29 @@ must have move patterns for this mode.",
>   hook_bool_mode_false)
>
>  DEFHOOK
> +(compatible_vector_types_p,
> + "Return true if there is no target-specific reason for treating\n\
> +vector types @var{type1} and @var{type2} as distinct types.  The caller\n\
> +has already checked for target-independent reasons, meaning that the\n\
> +types are known to have the same mode, to have the same number of elements,\n\
> +and to have what the caller considers to be compatible element types.\n\
> +\n\
> +The main reason for defining this hook is to reject pairs of types\n\
> +that are handled differently by the target's calling convention.\n\
> +For example, when a new @var{N}-bit vector architecture is added\n\
> +to a target, the target may want to handle normal @var{N}-bit\n\
> +@code{VECTOR_TYPE} arguments and return values in the same way as\n\
> +before, to maintain backwards compatibility.  However, it may also\n\
> +provide new, architecture-specific @code{VECTOR_TYPE}s that are passed\n\
> +and returned in a more efficient way.  It is then important to maintain\n\
> +a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new\n\
> +architecture-specific ones.\n\
> +\n\
> +The default implementation returns true, which is correct for most targets.",
> + bool, (const_tree type1, const_tree type2),
> + hook_bool_const_tree_const_tree_true)
> +
> +DEFHOOK
>  (vector_alignment,
>   "This hook can be used to define the alignment for a vector of type\n\
>  @var{type}, in order to comply with a platform ABI.  The default is to\n\
> Index: gcc/hooks.h
> ===================================================================
> --- gcc/hooks.h 2020-01-06 12:57:54.749937335 +0000
> +++ gcc/hooks.h 2020-01-07 10:24:01.542344777 +0000
> @@ -45,6 +45,7 @@ extern bool hook_bool_uint_uint_mode_fal
>  extern bool hook_bool_uint_mode_true (unsigned int, machine_mode);
>  extern bool hook_bool_tree_false (tree);
>  extern bool hook_bool_const_tree_false (const_tree);
> +extern bool hook_bool_const_tree_const_tree_true (const_tree, const_tree);
>  extern bool hook_bool_tree_true (tree);
>  extern bool hook_bool_const_tree_true (const_tree);
>  extern bool hook_bool_gsiptr_false (gimple_stmt_iterator *);
> Index: gcc/hooks.c
> ===================================================================
> --- gcc/hooks.c 2020-01-06 12:57:54.745937361 +0000
> +++ gcc/hooks.c 2020-01-07 10:24:01.542344777 +0000
> @@ -313,6 +313,12 @@ hook_bool_const_tree_false (const_tree)
>  }
>
>  bool
> +hook_bool_const_tree_const_tree_true (const_tree, const_tree)
> +{
> +  return true;
> +}
> +
> +bool
>  hook_bool_tree_true (tree)
>  {
>    return true;
> Index: gcc/doc/tm.texi.in
> ===================================================================
> --- gcc/doc/tm.texi.in  2020-01-06 12:57:53.657944518 +0000
> +++ gcc/doc/tm.texi.in  2020-01-07 10:24:01.542344777 +0000
> @@ -3365,6 +3365,8 @@ stack.
>
>  @hook TARGET_VECTOR_MODE_SUPPORTED_P
>
> +@hook TARGET_COMPATIBLE_VECTOR_TYPES_P
> +
>  @hook TARGET_ARRAY_MODE
>
>  @hook TARGET_ARRAY_MODE_SUPPORTED_P
> Index: gcc/doc/tm.texi
> ===================================================================
> --- gcc/doc/tm.texi     2020-01-06 12:57:53.649944570 +0000
> +++ gcc/doc/tm.texi     2020-01-07 10:24:01.542344777 +0000
> @@ -4324,6 +4324,27 @@ insns involving vector mode @var{mode}.
>  must have move patterns for this mode.
>  @end deftypefn
>
> +@deftypefn {Target Hook} bool TARGET_COMPATIBLE_VECTOR_TYPES_P (const_tree @var{type1}, const_tree @var{type2})
> +Return true if there is no target-specific reason for treating
> +vector types @var{type1} and @var{type2} as distinct types.  The caller
> +has already checked for target-independent reasons, meaning that the
> +types are known to have the same mode, to have the same number of elements,
> +and to have what the caller considers to be compatible element types.
> +
> +The main reason for defining this hook is to reject pairs of types
> +that are handled differently by the target's calling convention.
> +For example, when a new @var{N}-bit vector architecture is added
> +to a target, the target may want to handle normal @var{N}-bit
> +@code{VECTOR_TYPE} arguments and return values in the same way as
> +before, to maintain backwards compatibility.  However, it may also
> +provide new, architecture-specific @code{VECTOR_TYPE}s that are passed
> +and returned in a more efficient way.  It is then important to maintain
> +a distinction between the ``normal'' @code{VECTOR_TYPE}s and the new
> +architecture-specific ones.
> +
> +The default implementation returns true, which is correct for most targets.
> +@end deftypefn
> +
> @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
>  Return the mode that GCC should use for an array that has
>  @var{nelems} elements, with each element having mode @var{mode}.
> Index: gcc/gimple-expr.c
> ===================================================================
> --- gcc/gimple-expr.c   2020-01-06 12:58:10.545833431 +0000
> +++ gcc/gimple-expr.c   2020-01-07 10:24:01.542344777 +0000
> @@ -37,6 +37,7 @@ Software Foundation; either version 3, o
>  #include "tree-pass.h"
>  #include "stringpool.h"
>  #include "attribs.h"
> +#include "target.h"
>
>  /* ----- Type related -----  */
>
> @@ -147,10 +148,12 @@ useless_type_conversion_p (tree outer_ty
>
>    /* Recurse for vector types with the same number of subparts.  */
>    else if (TREE_CODE (inner_type) == VECTOR_TYPE
> -          && TREE_CODE (outer_type) == VECTOR_TYPE
> -          && TYPE_PRECISION (inner_type) == TYPE_PRECISION (outer_type))
> -    return useless_type_conversion_p (TREE_TYPE (outer_type),
> -                                     TREE_TYPE (inner_type));
> +          && TREE_CODE (outer_type) == VECTOR_TYPE)
> +    return (known_eq (TYPE_VECTOR_SUBPARTS (inner_type),
> +                     TYPE_VECTOR_SUBPARTS (outer_type))
> +           && useless_type_conversion_p (TREE_TYPE (outer_type),
> +                                         TREE_TYPE (inner_type))
> +           && targetm.compatible_vector_types_p (inner_type, outer_type));
>
>    else if (TREE_CODE (inner_type) == ARRAY_TYPE
>            && TREE_CODE (outer_type) == ARRAY_TYPE)
> Index: gcc/config/aarch64/aarch64.c
> ===================================================================
> --- gcc/config/aarch64/aarch64.c        2020-01-07 10:18:06.572651552 +0000
> +++ gcc/config/aarch64/aarch64.c        2020-01-07 10:24:01.538344801 +0000
> @@ -2098,6 +2098,15 @@ aarch64_fntype_abi (const_tree fntype)
>    return default_function_abi;
>  }
>
> +/* Implement TARGET_COMPATIBLE_VECTOR_TYPES_P.  */
> +
> +static bool
> +aarch64_compatible_vector_types_p (const_tree type1, const_tree type2)
> +{
> +  return (aarch64_sve::builtin_type_p (type1)
> +         == aarch64_sve::builtin_type_p (type2));
> +}
> +
>  /* Return true if we should emit CFI for register REGNO.  */
>
>  static bool
> @@ -22099,6 +22108,9 @@ #define TARGET_USE_BLOCKS_FOR_CONSTANT_P
>  #undef TARGET_VECTOR_MODE_SUPPORTED_P
>  #define TARGET_VECTOR_MODE_SUPPORTED_P aarch64_vector_mode_supported_p
>
> +#undef TARGET_COMPATIBLE_VECTOR_TYPES_P
> +#define TARGET_COMPATIBLE_VECTOR_TYPES_P aarch64_compatible_vector_types_p
> +
>  #undef TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT
>  #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
>    aarch64_builtin_support_vector_misalignment
> Index: gcc/config/aarch64/aarch64-sve-builtins.cc
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:21:17.575410530 +0000
> +++ gcc/config/aarch64/aarch64-sve-builtins.cc  2020-01-07 10:24:01.534344828 +0000
> @@ -2265,9 +2265,13 @@ tree
>  gimple_folder::convert_pred (gimple_seq &stmts, tree vectype,
>                              unsigned int argno)
>  {
> -  tree predtype = truth_type_for (vectype);
>    tree pred = gimple_call_arg (call, argno);
> -  return gimple_build (&stmts, VIEW_CONVERT_EXPR, predtype, pred);
> +  if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (pred)),
> +               TYPE_VECTOR_SUBPARTS (vectype)))
> +    return pred;
> +
> +  return gimple_build (&stmts, VIEW_CONVERT_EXPR,
> +                      truth_type_for (vectype), pred);
>  }
>
>  /* Return a pointer to the address in a contiguous load or store,
> Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c
> ===================================================================
> --- /dev/null   2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_1.c    2020-01-07 10:24:01.546344751 +0000
> @@ -0,0 +1,99 @@
> +/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
> +
> +#include <arm_sve.h>
> +
> +typedef float16_t float16x16_t __attribute__((vector_size (32)));
> +typedef float32_t float32x8_t __attribute__((vector_size (32)));
> +typedef float64_t float64x4_t __attribute__((vector_size (32)));
> +typedef int8_t int8x32_t __attribute__((vector_size (32)));
> +typedef int16_t int16x16_t __attribute__((vector_size (32)));
> +typedef int32_t int32x8_t __attribute__((vector_size (32)));
> +typedef int64_t int64x4_t __attribute__((vector_size (32)));
> +typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
> +typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
> +typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
> +typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
> +
> +void float16_callee (float16x16_t);
> +void float32_callee (float32x8_t);
> +void float64_callee (float64x4_t);
> +void int8_callee (int8x32_t);
> +void int16_callee (int16x16_t);
> +void int32_callee (int32x8_t);
> +void int64_callee (int64x4_t);
> +void uint8_callee (uint8x32_t);
> +void uint16_callee (uint16x16_t);
> +void uint32_callee (uint32x8_t);
> +void uint64_callee (uint64x4_t);
> +
> +void
> +float16_caller (void)
> +{
> +  float16_callee (svdup_f16 (1.0));
> +}
> +
> +void
> +float32_caller (void)
> +{
> +  float32_callee (svdup_f32 (2.0));
> +}
> +
> +void
> +float64_caller (void)
> +{
> +  float64_callee (svdup_f64 (3.0));
> +}
> +
> +void
> +int8_caller (void)
> +{
> +  int8_callee (svindex_s8 (0, 1));
> +}
> +
> +void
> +int16_caller (void)
> +{
> +  int16_callee (svindex_s16 (0, 2));
> +}
> +
> +void
> +int32_caller (void)
> +{
> +  int32_callee (svindex_s32 (0, 3));
> +}
> +
> +void
> +int64_caller (void)
> +{
> +  int64_callee (svindex_s64 (0, 4));
> +}
> +
> +void
> +uint8_caller (void)
> +{
> +  uint8_callee (svindex_u8 (1, 1));
> +}
> +
> +void
> +uint16_caller (void)
> +{
> +  uint16_callee (svindex_u16 (1, 2));
> +}
> +
> +void
> +uint32_caller (void)
> +{
> +  uint32_callee (svindex_u32 (1, 3));
> +}
> +
> +void
> +uint64_caller (void)
> +{
> +  uint64_callee (svindex_u64 (1, 4));
> +}
> +
> +/* { dg-final { scan-assembler-times {\tst1b\tz[0-9]+\.b, p[0-7], \[x0\]} 2 } } */
> +/* { dg-final { scan-assembler-times {\tst1h\tz[0-9]+\.h, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tst1w\tz[0-9]+\.s, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tst1d\tz[0-9]+\.d, p[0-7], \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tadd\tx0, sp, #?16\n} 11 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c
> ===================================================================
> --- /dev/null   2019-09-17 11:41:18.176664108 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/pcs/gnu_vectors_2.c    2020-01-07 10:24:01.546344751 +0000
> @@ -0,0 +1,99 @@
> +/* { dg-options "-O -msve-vector-bits=256 -fomit-frame-pointer" } */
> +
> +#include <arm_sve.h>
> +
> +typedef float16_t float16x16_t __attribute__((vector_size (32)));
> +typedef float32_t float32x8_t __attribute__((vector_size (32)));
> +typedef float64_t float64x4_t __attribute__((vector_size (32)));
> +typedef int8_t int8x32_t __attribute__((vector_size (32)));
> +typedef int16_t int16x16_t __attribute__((vector_size (32)));
> +typedef int32_t int32x8_t __attribute__((vector_size (32)));
> +typedef int64_t int64x4_t __attribute__((vector_size (32)));
> +typedef uint8_t uint8x32_t __attribute__((vector_size (32)));
> +typedef uint16_t uint16x16_t __attribute__((vector_size (32)));
> +typedef uint32_t uint32x8_t __attribute__((vector_size (32)));
> +typedef uint64_t uint64x4_t __attribute__((vector_size (32)));
> +
> +void float16_callee (svfloat16_t);
> +void float32_callee (svfloat32_t);
> +void float64_callee (svfloat64_t);
> +void int8_callee (svint8_t);
> +void int16_callee (svint16_t);
> +void int32_callee (svint32_t);
> +void int64_callee (svint64_t);
> +void uint8_callee (svuint8_t);
> +void uint16_callee (svuint16_t);
> +void uint32_callee (svuint32_t);
> +void uint64_callee (svuint64_t);
> +
> +void
> +float16_caller (float16x16_t arg)
> +{
> +  float16_callee (arg);
> +}
> +
> +void
> +float32_caller (float32x8_t arg)
> +{
> +  float32_callee (arg);
> +}
> +
> +void
> +float64_caller (float64x4_t arg)
> +{
> +  float64_callee (arg);
> +}
> +
> +void
> +int8_caller (int8x32_t arg)
> +{
> +  int8_callee (arg);
> +}
> +
> +void
> +int16_caller (int16x16_t arg)
> +{
> +  int16_callee (arg);
> +}
> +
> +void
> +int32_caller (int32x8_t arg)
> +{
> +  int32_callee (arg);
> +}
> +
> +void
> +int64_caller (int64x4_t arg)
> +{
> +  int64_callee (arg);
> +}
> +
> +void
> +uint8_caller (uint8x32_t arg)
> +{
> +  uint8_callee (arg);
> +}
> +
> +void
> +uint16_caller (uint16x16_t arg)
> +{
> +  uint16_callee (arg);
> +}
> +
> +void
> +uint32_caller (uint32x8_t arg)
> +{
> +  uint32_callee (arg);
> +}
> +
> +void
> +uint64_caller (uint64x4_t arg)
> +{
> +  uint64_callee (arg);
> +}
> +
> +/* { dg-final { scan-assembler-times {\tld1b\tz0\.b, p[0-7]/z, \[x0\]} 2 } } */
> +/* { dg-final { scan-assembler-times {\tld1h\tz0\.h, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tld1w\tz0\.s, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-times {\tld1d\tz0\.d, p[0-7]/z, \[x0\]} 3 } } */
> +/* { dg-final { scan-assembler-not {\tst1[bhwd]\t} } } */
