Re: [PATCH] vect: allow using inbranch simdclones for masked loops

2023-11-03 Thread Richard Biener
On Fri, 3 Nov 2023, Andre Vieira (lists) wrote:

> 
> 
> On 03/11/2023 07:31, Richard Biener wrote:
> 
> > 
> > OK.
> > 
> > I do wonder about the gfortran testsuite adjustments though.
> > 
> > !GCC$ builtin (sin) attributes simd (inbranch)
> > 
> >! this should not be using simd clone
> >y4 = sin(x8)
> > 
> > previously we wouldn't vectorize this as no notinbranch simd function
> > is available but now we do since we use the inbranch function for the
> > notinbranch call.  If that's desired then a better modification of
> > the test would be to expect vectorization, no?
> > 
> 
> I was in two minds about this. I interpreted the test to be about the fact
> that sin is overloaded in fortran, given the name of the program 'program
> test_overloaded_intrinsic', and thus I thought it was testing that it calls
> sinf when a real(4) is passed and sin for a real(8) and that simdclones aren't
> used for the wrong overload. That doesn't quite explain why the pragma for
> sin(double) was added in the first place, that wouldn't have been necessary,
> but then again neither are the cos and cosf.
> 
> Happy to put it back in and test that the 'masked' simdclone is used using
> some regexp too.

Looking at when the test was added it was added when supporting
-fpre-include.  So it hardly was a test for our SIMD capabilities
but for having those OMP simd declarations.

Your original change is OK.

Richard.


Re: [PATCH] vect: allow using inbranch simdclones for masked loops

2023-11-03 Thread Andre Vieira (lists)




On 03/11/2023 07:31, Richard Biener wrote:



OK.

I do wonder about the gfortran testsuite adjustments though.

!GCC$ builtin (sin) attributes simd (inbranch)

   ! this should not be using simd clone
   y4 = sin(x8)

previously we wouldn't vectorize this as no notinbranch simd function
is available but now we do since we use the inbranch function for the
notinbranch call.  If that's desired then a better modification of
the test would be to expect vectorization, no?



I was in two minds about this. I interpreted the test to be about the 
fact that sin is overloaded in fortran, given the name of the program 
'program test_overloaded_intrinsic', and thus I thought it was testing 
that it calls sinf when a real(4) is passed and sin for a real(8) and 
that simdclones aren't used for the wrong overload. That doesn't quite 
explain why the pragma for sin(double) was added in the first place, 
that wouldn't have been necessary, but then again neither are the cos 
and cosf.


Happy to put it back in and test that the 'masked' simdclone is used 
using some regexp too.


Re: [PATCH] vect: allow using inbranch simdclones for masked loops

2023-11-03 Thread Richard Biener
On Thu, 2 Nov 2023, Andre Vieira (lists) wrote:

> Hi,
> 
> In a previous patch I did most of the work for this, but forgot to change the
> check for number of arguments matching between call and simdclone.  This check
> should accept calls without a mask to be matched against simdclones with mask
> arguments.  I also added tests to verify this feature actually works.
> 
> 
> For the simd-builtins tests I decided to remove the sin (double) simdclone
> which would now be used, because it was inbranch and we enable their use for
> not inbranch.  Given the nature of the test, removing it made more sense, but
> thats not a strong opinion, happy to change.
> 
> Bootstrapped and regression tested on aarch64-unknown-linux-gnu and
> x86_64-pc-linux-gnu.
> 
> OK for trunk?

OK.

I do wonder about the gfortran testsuite adjustments though.

!GCC$ builtin (sin) attributes simd (inbranch)

  ! this should not be using simd clone
  y4 = sin(x8)

previously we wouldn't vectorize this as no notinbranch simd function
is available but now we do since we use the inbranch function for the
notinbranch call.  If that's desired then a better modification of
the test would be to expect vectorization, no?

Richard.

> PS: I'll be away for two weeks from tomorrow, it would be really nice if this
> can go in for gcc-14, otherwise the previous work I did for this won't have
> any actual visible effect :(
> 
> 
> gcc/ChangeLog:
> 
>   * tree-vect-stmts.cc (vectorizable_simd_clone_call): Allow unmasked
>   calls to use masked simdclones.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-simd-clone-20.c: New file.
>   * gfortran.dg/simd-builtins-1.h: Adapt.
>   * gfortran.dg/simd-builtins-6.f90: Adapt.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] vect: allow using inbranch simdclones for masked loops

2023-11-02 Thread Andre Vieira (lists)

Hi,

In a previous patch I did most of the work for this, but forgot to 
change the check for number of arguments matching between call and 
simdclone.  This check should accept calls without a mask to be matched 
against simdclones with mask arguments.  I also added tests to verify 
this feature actually works.



For the simd-builtins tests I decided to remove the sin (double) 
simdclone which would now be used, because it was inbranch and we enable 
their use for not inbranch.  Given the nature of the test, removing it 
made more sense, but thats not a strong opinion, happy to change.


Bootstrapped and regression tested on aarch64-unknown-linux-gnu and 
x86_64-pc-linux-gnu.


OK for trunk?

PS: I'll be away for two weeks from tomorrow, it would be really nice if 
this can go in for gcc-14, otherwise the previous work I did for this 
won't have any actual visible effect :(



gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_simd_clone_call): Allow unmasked
calls to use masked simdclones.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-simd-clone-20.c: New file.
* gfortran.dg/simd-builtins-1.h: Adapt.
* gfortran.dg/simd-builtins-6.f90: Adapt.diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c 
b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c
new file mode 100644
index 
..9f51a68f3a0c8851af2cd26bd8235c771b851d7d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-20.c
@@ -0,0 +1,87 @@
+/* { dg-require-effective-target vect_simd_clones } */
+/* { dg-additional-options "-fopenmp-simd --param vect-epilogues-nomask=0" } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */
+
+/* Test that simd inbranch clones work correctly.  */
+
+#ifndef TYPE
+#define TYPE int
+#endif
+
+/* A simple function that will be cloned.  */
+#pragma omp declare simd inbranch
+TYPE __attribute__((noinline))
+foo (TYPE a)
+{
+  return a + 1;
+}
+
+/* Check that "inbranch" clones are called correctly.  */
+
+void __attribute__((noipa))
+masked (TYPE * __restrict a, TYPE * __restrict b, int size)
+{
+  #pragma omp simd
+  for (int i = 0; i < size; i++)
+b[i] = foo(a[i]);
+}
+
+/* Check that "inbranch" works when there might be unrolling.  */
+
+void __attribute__((noipa))
+masked_fixed (TYPE * __restrict a, TYPE * __restrict b)
+{
+  #pragma omp simd
+  for (int i = 0; i < 128; i++)
+b[i] = foo(a[i]);
+}
+
+/* Validate the outputs.  */
+
+void
+check_masked (TYPE *b, int size)
+{
+  for (int i = 0; i < size; i++)
+if (b[i] != (TYPE)(i + 1))
+  {
+   __builtin_printf ("error at %d\n", i);
+   __builtin_exit (1);
+  }
+}
+
+int
+main ()
+{
+  TYPE a[1024];
+  TYPE b[1024];
+
+  for (int i = 0; i < 1024; i++)
+a[i] = i;
+
+  masked_fixed (a, b);
+  check_masked (b, 128);
+
+  /* Test various sizes to cover machines with different vectorization
+ factors.  */
+  for (int size = 8; size <= 1024; size *= 2)
+{
+  masked (a, b, size);
+  check_masked (b, size);
+}
+
+  /* Test sizes that might exercise the partial vector code-path.  */
+  for (int size = 8; size <= 1024; size *= 2)
+{
+  masked (a, b, size-4);
+  check_masked (b, size-4);
+}
+
+  return 0;
+}
+
+/* Ensure the the in-branch simd clones are used on targets that support them. 
 */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" 
{ target { aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" 
{ target { x86_64*-*-* } } } } */
+
+/* The LTO test produces two dump files and we scan the wrong one.  */
+/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */
diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-1.h 
b/gcc/testsuite/gfortran.dg/simd-builtins-1.h
index 
88d555cf41ad065ea525a63d7c05d15d3e5b54ed..08b73514a67d5791d35203530d039741946e9dcc
 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-1.h
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-1.h
@@ -1,4 +1,3 @@
-!GCC$ builtin (sin) attributes simd (inbranch)
 !GCC$ builtin (sinf) attributes simd (notinbranch)
 !GCC$ builtin (cosf) attributes simd
 !GCC$ builtin (cosf) attributes simd (notinbranch)
diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90 
b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
index 
60bcac78f3e0cc492930f3eb73cf97065312dc1c..2c68f9f1818a35674a0aef15793aa312a48199a8
 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
@@ -2,7 +2,6 @@
 ! { dg-additional-options "-nostdinc -Ofast -fdump-tree-optimized" }
 ! { dg-additional-options "-msse2 -mno-avx" { target i?86-*-linux* 
x86_64-*-linux* } }
 
-!GCC$ builtin (sin) attributes simd (inbranch)
 !GCC$ builtin (sinf) attributes simd (notinbranch)
 !GCC$ builtin (cosf) attributes simd
 !GCC$ builtin (cosf) attributes simd (notinbranch)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index