Re: [RFC 0/X] Implement GCC support for AArch64 libmvec

Andre Vieira (lists) via Gcc-patches Thu, 20 Apr 2023 08:23:07 -0700



On 20/04/2023 15:51, Richard Sandiford wrote:

"Andre Vieira (lists)" <andre.simoesdiasvie...@arm.com> writes:

Hi all,

This is a series of patches/RFCs to implement support in GCC to be able
to target AArch64's libmvec functions that will be/are being added to glibc.
We have chosen to use the omp pragma '#pragma omp declare variant ...'
with a simd construct as the way for glibc to inform GCC what functions
are available.

For example, if we would like to supply a vector version of the scalar
'cosf' we would have an include file with something like:
typedef __attribute__((__neon_vector_type__(4))) float __f32x4_t;
typedef __attribute__((__neon_vector_type__(2))) float __f32x2_t;
typedef __SVFloat32_t __sv_f32_t;
typedef __SVBool_t __sv_bool_t;
__f32x4_t _ZGVnN4v_cosf (__f32x4_t);
__f32x2_t _ZGVnN2v_cosf (__f32x2_t);
__sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
#pragma omp declare variant(_ZGVnN4v_cosf) \
      match(construct = {simd(notinbranch, simdlen(4))}, \
            device = {isa("simd")})
#pragma omp declare variant(_ZGVnN2v_cosf) \
      match(construct = {simd(notinbranch, simdlen(2))}, \
            device = {isa("simd")})
#pragma omp declare variant(_ZGVsMxv_cosf) \
      match(construct = {simd(inbranch)}, device = {isa("sve")})
extern float cosf (float);

The BETA ABI can be found in the vfabia64 subdir of
https://github.com/ARM-software/abi-aa/
This currently disagrees with how this patch series implements 'omp
declare simd' for SVE and I also do not see a need for the 'omp declare
variant' scalable extension constructs. I will make changes to the ABI
once we've finalized the co-design of the ABI and this implementation.


I don't see a good reason for dropping the extension("scalable").
The problem is that since the base spec requires a simdlen clause,
GCC should in general raise an error if simdlen is omitted.

Where can you find this in the specs? I tried to find it but couldn't.

Leaving out simdlen in an 'omp declare simd' is OK, I assume, since our
vector ABI defines behaviour for this. But I couldn't find what it means
for an 'omp declare variant'. It obviously can't be the same as for
declare simd, as that is defined to mean 'define a set of clones' and
only one clone can be associated with a declare variant.
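
As a rough illustration of the 'omp declare simd' case without simdlen,
here is a minimal sketch. The names scale_add and apply_scale_add are
hypothetical and not part of the patch series; which clones a compiler
actually emits for the pragma depends on the target and its vector ABI.
Built without -fopenmp or -fopenmp-simd, the pragmas are ignored and the
code runs as plain scalar C.

```c
#include <stddef.h>

/* Hypothetical scalar function (an assumption for illustration only).
   No simdlen clause is given, so the choice of vector lengths is left
   to the compiler and the target's vector ABI. */
#pragma omp declare simd notinbranch
float scale_add(float x)
{
    return 2.0f * x + 1.0f;
}

/* A loop the vectorizer may turn into calls to a simd clone of
   scale_add. If no clone is used, each iteration simply calls the
   scalar function, so the result is the same either way. */
void apply_scale_add(float *out, const float *in, size_t n)
{
#pragma omp simd
    for (size_t i = 0; i < n; i++)
        out[i] = scale_add(in[i]);
}
```

With 'omp declare variant', by contrast, each pragma names exactly one
existing function, so there is no set of clones for the compiler to
choose from.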


But I'm not sure it makes sense to ignore -msve-vector-bits= when
compiling the SVE version (which is what patch 4 seems to do).
If someone compiles with -march=armv8.4-a, we'll use all Armv8.4-A
features in the Advanced SIMD routines.  Why should we ignore
SVE-related target information for the SVE routines?

Not sure I understand what you mean. The vector ABI specifies that if
simdlen is omitted, an SVE VLA clone is available (in addition to the
NEON clones). So how would I take -msve-vector-bits into consideration?
Do you mean I ought to add them as options to pass to the function so
that they get used when doing the codegen for the clone (if a function
body is available)?

This is where things get a bit iffy for me though... We purposefully
generate an SVE simdclone regardless of command-line options, just like
x86 does, so why would these options affect simd clone generation but
not the actual availability of SVE? Just seems a bit odd...

A viable alternative would be to rely on declare variant for such
behaviour, where we could use function attributes to pass specific
target options to the variant's prototype, to be able to add more
specific tuning options per variant. Not sure it will work, but I can
try it with my rebased patches at some point. I have to admit though, it
is not a feature we are looking to use, so I'm not sure it's worth the
effort. The SVE simdclone codegen (with function bodies) is already
pretty bad, so if we do believe there is a use case for these, that might
be something we should focus on before this sort of more specific tuning.


Of course, the fact that we take command-line options into account
means that omp simd/variant clauses on linkonce/comdat group functions
are an ODR violation waiting to happen.  But the same is true for the
original scalar functions that the clauses are attached to.

Can't find proper definitions of linkonce/comdat group functions so
can't comment.


Thanks,
Richard

The patch series has three main steps:
1) Add SVE support for 'omp declare simd', see PR 96342
2) Enable GCC to use omp declare variants with simd constructs as simd
clones during auto-vectorization.
3) Add SLP support for vectorizable_simd_clone_call (This sounded like a
nice thing to add as we want to move away from non-slp vectorization).

Below you can see the list of current Patches/RFCs, the difference being
in how confident I am of the proposed changes. For the RFCs I am hoping
to get early comments on the approach, rather than more in-depth
code reviews.

I appreciate we are still in Stage 4, so I can completely understand if
you don't have time to review this now, but I thought it can't hurt to
post these early.

Andre Vieira:
[PATCH] omp: Replace simd_clone_subparts with TYPE_VECTOR_SUBPARTS
[PATCH] parloops: Copy target and optimizations when creating a function
clone
[PATCH] parloops: Allow poly nit and bound
[RFC] omp, aarch64: Add SVE support for 'omp declare simd' [PR 96342]
[RFC] omp: Create simd clones from 'omp declare variant's
[RFC] omp: Allow creation of simd clones from omp declare variant with
-fopenmp-simd flag

Work in progress:
[RFC] vect: Enable SLP codegen for vectorizable_simd_clone_call
