Add BUILT_IN_GOACC_KERNELS_INTERNAL (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries tom_devr...@mentor.com wrote:
 I'm submitting a patch series with initial support for the oacc kernels 
 directive.
 
 The patch series uses pass_parallelize_loops to implement parallelization of 
 loops in the oacc kernels region.

Committed to gomp-4_0-branch in r78:

commit fd3add90d38d5f1b38c9cb557404542b6383b2b0
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Tue Apr 21 19:24:57 2015 +

Add BUILT_IN_GOACC_KERNELS_INTERNAL

..., a variant of the GOACC_kernels builtin.  This variant does not call the
function passed as function pointer, and therefore is less of an optimization
barrier than the original variant.

The purpose of this variant is to allow the introduction of the GOACC_kernels
call before splitting off the region body into a function (something that is
currently done simultaneously).

gcc/
* builtin-attrs.def (DOT_DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
(ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST): Add
DEF_ATTR_TREE_LIST.
* omp-builtins.def (BUILT_IN_GOACC_KERNELS_INTERNAL): Add
DEF_GOACC_BUILTIN_FNSPEC.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@78 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp|6 ++
 gcc/builtin-attrs.def |4 
 gcc/omp-builtins.def  |5 +
 3 files changed, 15 insertions(+)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index b091dd5..7885189 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,11 @@
 2015-04-21  Tom de Vries  t...@codesourcery.com
 
+   * builtin-attrs.def (DOT_DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
+   (ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST): Add
+   DEF_ATTR_TREE_LIST.
+   * omp-builtins.def (BUILT_IN_GOACC_KERNELS_INTERNAL): Add
+   DEF_GOACC_BUILTIN_FNSPEC.
+
* builtins.def (DEF_GOACC_BUILTIN_FNSPEC): Define.
 
 2015-03-21  Tom de Vries  t...@codesourcery.com
diff --git gcc/builtin-attrs.def gcc/builtin-attrs.def
index 1338644..8eca053 100644
--- gcc/builtin-attrs.def
+++ gcc/builtin-attrs.def
@@ -64,6 +64,7 @@ DEF_ATTR_FOR_INT (6)
   DEF_ATTR_TREE_LIST (ATTR_LIST_##ENUM, ATTR_NULL, \
  ATTR_##ENUM, ATTR_NULL)
 DEF_ATTR_FOR_STRING (STR1, "1")
+DEF_ATTR_FOR_STRING (DOT_DOT_DOT_r_r_r, "...rrr")
 #undef DEF_ATTR_FOR_STRING
 
 /* Construct a tree for a list of two integers.  */
@@ -127,6 +128,9 @@ DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LIST, ATTR_PURE, \
ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_PURE_NOTHROW_LEAF_LIST, ATTR_PURE,\
ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST, \
+   ATTR_FNSPEC, ATTR_LIST_DOT_DOT_DOT_r_r_r, \
+   ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LIST, ATTR_NORETURN, \
ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_NORETURN_NOTHROW_LEAF_LIST, ATTR_NORETURN,\
diff --git gcc/omp-builtins.def gcc/omp-builtins.def
index 03955c4..cd273f2 100644
--- gcc/omp-builtins.def
+++ gcc/omp-builtins.def
@@ -39,6 +39,11 @@ DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DATA_END, "GOACC_data_end",
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_ENTER_EXIT_DATA, "GOACC_enter_exit_data",
   BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_INT_INT_VAR,
   ATTR_NOTHROW_LIST)
+DEF_GOACC_BUILTIN_FNSPEC (BUILT_IN_GOACC_KERNELS_INTERNAL,
+ "GOACC_kernels_internal",
+ BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR,
+ ATTR_FNSPEC_DOT_DOT_DOT_r_r_r_NOTHROW_LIST,
+ ATTR_NOTHROW_LIST, "...rrr")
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_KERNELS, "GOACC_kernels",
   BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR,
   ATTR_NOTHROW_LIST)


Grüße,
 Thomas




Handle global loop counters in c/c++ oacc kernels (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries tom_devr...@mentor.com wrote:
 I'm submitting a patch series with initial support for the oacc kernels 
 directive.

Committed to gomp-4_0-branch in r87:

commit abaf92b2db3c0799edac63cfb846af2dbde47423
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Tue Apr 21 20:27:40 2015 +

Handle global loop counters in c/c++ oacc kernels

gcc/
* passes.def: Add pass_fre after pass_ch_oacc_kernels.

gcc/testsuite/
* c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
* c-c++-common/goacc/kernels-one-counter-var.c: New test.
* g++.dg/ipa/devirt-37.C: Update for new pass_fre.
* g++.dg/ipa/devirt-40.C: Likewise.
* g++.dg/tree-ssa/pr61034.C: Likewise.
* gcc.dg/ipa/ipa-pta-13.c: Likewise.
* gcc.dg/ipa/ipa-pta-3.c: Likewise.
* gcc.dg/ipa/ipa-pta-4.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@87 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |2 +
 gcc/passes.def |1 +
 gcc/testsuite/ChangeLog.gomp   |9 
 .../goacc/kernels-counter-vars-function-scope.c|   55 
 .../c-c++-common/goacc/kernels-one-counter-var.c   |   54 +++
 gcc/testsuite/g++.dg/ipa/devirt-37.C   |   12 ++---
 gcc/testsuite/g++.dg/ipa/devirt-40.C   |6 +--
 gcc/testsuite/g++.dg/tree-ssa/pr61034.C|   10 ++--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-13.c  |6 +--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-3.c   |6 +--
 gcc/testsuite/gcc.dg/ipa/ipa-pta-4.c   |6 +--
 11 files changed, 144 insertions(+), 23 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index f14c3718..b1933ba 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,7 @@
 2015-04-21  Tom de Vries  t...@codesourcery.com
 
+   * passes.def: Add pass_fre after pass_ch_oacc_kernels.
+
* passes.def: Add pass_scev_cprop to pass_oacc_kernels.
* tree-ssa-loop.c (pass_scev_cprop::clone): New function.
 
diff --git gcc/passes.def gcc/passes.def
index 3e85808..04cbba0 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -91,6 +91,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_oacc_kernels);
  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
  NEXT_PASS (pass_ch_oacc_kernels);
+ NEXT_PASS (pass_fre);
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index eed22e2..ed80f5b 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,6 +1,15 @@
 2015-04-21  Tom de Vries  t...@codesourcery.com
Thomas Schwinge  tho...@codesourcery.com
 
+   * c-c++-common/goacc/kernels-counter-vars-function-scope.c: New test.
+   * c-c++-common/goacc/kernels-one-counter-var.c: New test.
+   * g++.dg/ipa/devirt-37.C: Update for new pass_fre.
+   * g++.dg/ipa/devirt-40.C: Likewise.
+   * g++.dg/tree-ssa/pr61034.C: Likewise.
+   * gcc.dg/ipa/ipa-pta-13.c: Likewise.
+   * gcc.dg/ipa/ipa-pta-3.c: Likewise.
+   * gcc.dg/ipa/ipa-pta-4.c: Likewise.
+
* gcc.dg/pr41488.c: Update for new pass_scev_cprop.
* gcc.dg/tree-ssa/loop-17.c: Likewise.
* gcc.dg/tree-ssa/loop-39.c: Likewise.
diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
new file mode 100644
index 000..06cdb29
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -0,0 +1,55 @@
+/* { dg-additional-options "-O2" } */
+/* { dg-additional-options "-ftree-parallelize-loops=32" } */
+/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */
+/* { dg-additional-options "-fdump-tree-optimized" } */
+
+#include <stdlib.h>
+
+#define N (1024 * 512)
+#define COUNTERTYPE unsigned int
+
+int
+main (void)
+{
+  unsigned int *__restrict a;
+  unsigned int *__restrict b;
+  unsigned int *__restrict c;
+  COUNTERTYPE i;
+  COUNTERTYPE ii;
+
+  a = (unsigned int *)malloc (N * sizeof (unsigned int));
+  b = (unsigned int *)malloc (N * sizeof (unsigned int));
+  c = (unsigned int *)malloc (N * sizeof (unsigned int));
+
+  for (i = 0; i < N; i++)
+    a[i] = i * 2;
+
+  for (i = 0; i < N; i++)
+    b[i] = i * 4;
+
+#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N])
+  {
+    for (ii = 0; ii < N; ii++)
+      c[ii] = a[ii] + b[ii];
+  }
+
+  for (i = 0; i < N; i++)
+    if (c[i] != a[i] + b[i])
+      abort ();
+
+  free (a);
+  free (b);
+  free (c);
+
+  return 0;
+}
+
+/* Check that only one loop is analyzed, and that it can be parallelized.  */
+/* { 

Handle oacc kernels with other directives (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries tom_devr...@mentor.com wrote:
 I'm submitting a patch series with initial support for the oacc kernels 
 directive.

Committed to gomp-4_0-branch in r88:

commit 7109b39defb87bc839983339c9fb4cdcb3891238
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Tue Apr 21 20:32:01 2015 +

Handle oacc kernels with other directives

Mark directives with fn spec attributes to prevent them from acting as
optimization barrier.
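
To illustrate what a fn spec string such as "..rrr" conveys, here is a minimal sketch. It assumes the convention GCC used around this time, as far as I recall it: the first character describes the return value, each following character describes one argument, and 'r' marks an argument that is neither clobbered nor escapes. The helper name fnspec_arg_meaning is hypothetical and is not part of GCC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not GCC code: describe what a fn spec string
   says about argument number ARG (0-based).  Position 0 of the string
   is assumed to describe the return value, so argument ARG is looked
   up at position ARG + 1.  */
const char *
fnspec_arg_meaning (const char *fnspec, size_t arg)
{
  size_t pos = arg + 1;

  /* Arguments past the end of the string carry no information.  */
  if (pos >= strlen (fnspec))
    return "unknown";

  switch (fnspec[pos])
    {
    case 'x':
    case 'X':
      return "unused";
    case 'R':
      return "accessed directly, not clobbered, does not escape";
    case 'r':
      return "not clobbered, does not escape";
    case 'W':
      return "accessed directly, does not escape";
    case 'w':
      return "does not escape";
    default:
      /* '.' and anything else: nothing is known.  */
      return "unknown";
    }
}
```

Under that assumption, fnspec_arg_meaning ("..rrr", 0) reports nothing known about the first argument, while arguments 1 to 3 are reported as not clobbered and not escaping, which is the information that lets the optimizers look past the call.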

gcc/
* builtin-attrs.def (DOT_DOT_r_r_r): Add DEF_ATTR_FOR_STRING.
(ATTR_FNSPEC_DOT_DOT_r_r_r_NOTHROW_LIST): Add DEF_ATTR_TREE_LIST.
* omp-builtins.def (BUILT_IN_GOACC_DATA_START)
(BUILT_IN_GOACC_ENTER_EXIT_DATA, BUILT_IN_GOACC_UPDATE): Use
DEF_GOACC_BUILTIN_FNSPEC instead of DEF_GOACC_BUILTIN.

gcc/testsuite/
* c-c++-common/goacc/kernels-loop-data-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit-2.c: New test.
* c-c++-common/goacc/kernels-loop-data-enter-exit.c: New test.
* c-c++-common/goacc/kernels-loop-data-update.c: New test.
* c-c++-common/goacc/kernels-loop-data.c: New test.
* c-c++-common/goacc/kernels-parallel-loop-data-enter-exit.c: New
test.
* gfortran.dg/goacc/kernels-loop-data-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: New test.
* gfortran.dg/goacc/kernels-loop-data-update.f95: New test.
* gfortran.dg/goacc/kernels-loop-data.f95: New test.
* gfortran.dg/goacc/kernels-parallel-loop-data-enter-exit.f95: New
test.

libgomp/
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-2.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit-2.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-enter-exit.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data-update.c:
New test.
* testsuite/libgomp.oacc-c-c++-common/kernels-loop-data.c: New
test.
* testsuite/libgomp.oacc-c-c++-common/kernels-parallel-loop-data-enter-exit.c:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95: New
test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95:
New test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95: New
test.
* testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: New test.
* testsuite/libgomp.oacc-fortran/kernels-parallel-loop-data-enter-exit.f95:
New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@88 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |6 ++
 gcc/builtin-attrs.def  |3 +
 gcc/omp-builtins.def   |   21 +++---
 gcc/testsuite/ChangeLog.gomp   |   15 +
 .../c-c++-common/goacc/kernels-loop-data-2.c   |   71 
 .../goacc/kernels-loop-data-enter-exit-2.c |   69 +++
 .../goacc/kernels-loop-data-enter-exit.c   |   66 ++
 .../c-c++-common/goacc/kernels-loop-data-update.c  |   66 ++
 .../c-c++-common/goacc/kernels-loop-data.c |   65 ++
 .../goacc/kernels-parallel-loop-data-enter-exit.c  |   67 ++
 .../gfortran.dg/goacc/kernels-loop-data-2.f95  |   52 ++
 .../goacc/kernels-loop-data-enter-exit-2.f95   |   52 ++
 .../goacc/kernels-loop-data-enter-exit.f95 |   50 ++
 .../gfortran.dg/goacc/kernels-loop-data-update.f95 |   49 ++
 .../gfortran.dg/goacc/kernels-loop-data.f95|   50 ++
 .../kernels-parallel-loop-data-enter-exit.f95  |   51 ++
 libgomp/ChangeLog.gomp |   24 +++
 .../kernels-loop-data-2.c  |   56 +++
 .../kernels-loop-data-enter-exit-2.c   |   54 +++
 .../kernels-loop-data-enter-exit.c |   51 ++
 .../kernels-loop-data-update.c |   53 +++
 .../libgomp.oacc-c-c++-common/kernels-loop-data.c  |   50 ++
 .../kernels-parallel-loop-data-enter-exit.c|   52 ++
 .../libgomp.oacc-fortran/kernels-loop-data-2.f95   |   38 +++
 .../kernels-loop-data-enter-exit-2.f95 |   38 +++
 .../kernels-loop-data-enter-exit.f95   |   36 ++
 .../kernels-loop-data-update.f95   |   36 ++
 

Handle global loop counters in fortran oacc kernels (was: openacc kernels directive -- initial support)

2015-04-21 Thread Thomas Schwinge
Hi!

On Sat, 15 Nov 2014 13:14:52 +0100, Tom de Vries tom_devr...@mentor.com wrote:
 I'm submitting a patch series with initial support for the oacc kernels 
 directive.

Committed to gomp-4_0-branch in r86:

commit 0c33234340aa17536c2c86e0982c42070c89226b
Author: tschwinge tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4
Date:   Tue Apr 21 20:22:54 2015 +

Handle global loop counters in fortran oacc kernels

Fortran cannot declare loop counters whose scope is limited to the kernels
region, and loop counters with function scope inhibit parallelization.  At the
technical level, it seems possible to do DCE and get rid of the dead code that
is inhibiting parallelization (in other words, the code copying the loop
iterator value out of the region), but probably some effort would be involved.

Another possibility is to add an assignment of the final value of the loop
iteration variable after the loop to cut the dependency, though this will only
work for loops where that value is known at compile time -- which is exactly
what pass_scev_cprop does.
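
The effect described above can be sketched at source level. This is only an illustration in C, not what the pass emits: the function names add_before and add_after and the constant N are hypothetical, and the real transformation operates on GIMPLE rather than on source code.

```c
enum { N = 1024 };

/* Before: the counter has function scope and is used after the loop,
   so its value must be copied out of the loop.  */
unsigned int
add_before (const unsigned int *a, const unsigned int *b, unsigned int *c)
{
  unsigned int ii;

  for (ii = 0; ii < N; ii++)
    c[ii] = a[ii] + b[ii];

  return ii;
}

/* After: the final value of the counter is known at compile time, so
   an assignment of that value after the loop makes the later use
   independent of the loop itself.  */
unsigned int
add_after (const unsigned int *a, const unsigned int *b, unsigned int *c)
{
  unsigned int ii;

  for (ii = 0; ii < N; ii++)
    c[ii] = a[ii] + b[ii];

  ii = N;  /* final value of the loop iteration variable */
  return ii;
}
```

Both functions compute the same result. In the second one the code after the loop no longer reads a value produced inside the loop, which is the dependency the commit message wants to cut.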

gcc/
* passes.def: Add pass_scev_cprop to pass_oacc_kernels.
* tree-ssa-loop.c (pass_scev_cprop::clone): New function.

gcc/testsuite/
* gcc.dg/pr41488.c: Update for new pass_scev_cprop.
* gcc.dg/tree-ssa/loop-17.c: Likewise.
* gcc.dg/tree-ssa/loop-39.c: Likewise.
* gcc.dg/tree-ssa/scev-7.c: Likewise.
* gfortran.dg/goacc/kernels-loop-2.f95: New test.
* gfortran.dg/goacc/kernels-loop.f95: New test.

libgomp/
* testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: New test.
* testsuite/libgomp.oacc-fortran/kernels-loop.f95: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@86 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp |3 ++
 gcc/passes.def |1 +
 gcc/testsuite/ChangeLog.gomp   |7 +++
 gcc/testsuite/gcc.dg/pr41488.c |6 +--
 gcc/testsuite/gcc.dg/tree-ssa/loop-17.c|6 +--
 gcc/testsuite/gcc.dg/tree-ssa/loop-39.c|6 +--
 gcc/testsuite/gcc.dg/tree-ssa/scev-7.c |6 +--
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 |   46 
 gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   |   40 +
 gcc/tree-ssa-loop.c|1 +
 libgomp/ChangeLog.gomp |3 ++
 .../libgomp.oacc-fortran/kernels-loop-2.f95|   32 ++
 .../libgomp.oacc-fortran/kernels-loop.f95  |   28 
 13 files changed, 173 insertions(+), 12 deletions(-)

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index bf0ee52..f14c3718 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,5 +1,8 @@
 2015-04-21  Tom de Vries  t...@codesourcery.com
 
+   * passes.def: Add pass_scev_cprop to pass_oacc_kernels.
+   * tree-ssa-loop.c (pass_scev_cprop::clone): New function.
+
* passes.def: Add pass_parallelize_loops_oacc_kernels in pass group
pass_oacc_kernels.
* tree-parloops.c (create_parallel_loop, gen_parallel_loop): Add
diff --git gcc/passes.def gcc/passes.def
index 2d2e286..3e85808 100644
--- gcc/passes.def
+++ gcc/passes.def
@@ -94,6 +94,7 @@ along with GCC; see the file COPYING3.  If not see
  NEXT_PASS (pass_tree_loop_init);
  NEXT_PASS (pass_lim);
  NEXT_PASS (pass_copy_prop);
+ NEXT_PASS (pass_scev_cprop);
  NEXT_PASS (pass_parallelize_loops_oacc_kernels);
  NEXT_PASS (pass_expand_omp_ssa);
  NEXT_PASS (pass_tree_loop_done);
diff --git gcc/testsuite/ChangeLog.gomp gcc/testsuite/ChangeLog.gomp
index 2c6abff..eed22e2 100644
--- gcc/testsuite/ChangeLog.gomp
+++ gcc/testsuite/ChangeLog.gomp
@@ -1,6 +1,13 @@
 2015-04-21  Tom de Vries  t...@codesourcery.com
Thomas Schwinge  tho...@codesourcery.com
 
+   * gcc.dg/pr41488.c: Update for new pass_scev_cprop.
+   * gcc.dg/tree-ssa/loop-17.c: Likewise.
+   * gcc.dg/tree-ssa/loop-39.c: Likewise.
+   * gcc.dg/tree-ssa/scev-7.c: Likewise.
+   * gfortran.dg/goacc/kernels-loop-2.f95: New test.
+   * gfortran.dg/goacc/kernels-loop.f95: New test.
+
* c-c++-common/goacc/kernels-loop-2.c: New test.
* c-c++-common/goacc/kernels-loop.c: New test.
* c-c++-common/goacc/kernels-loop-n.c: New test.
diff --git gcc/testsuite/gcc.dg/pr41488.c gcc/testsuite/gcc.dg/pr41488.c
index c4bc428..1f306b4 100644
--- gcc/testsuite/gcc.dg/pr41488.c
+++ gcc/testsuite/gcc.dg/pr41488.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-sccp-scev" } */
+/* { dg-options "-O2 -fdump-tree-sccp2-scev" } */
 
 struct struct_t
 {
@@ -14,5 +14,5 @@ void foo (struct struct_t* sp, int start, int end)
 sp->data[i+start] = 0;
 }
 

Re: openacc kernels directive -- initial support

2014-11-19 Thread Tom de Vries

On 15-11-14 13:14, Tom de Vries wrote:

  Don't allow flto-partition=balance for fopenacc
Unsubmitted. This works around a compilation problem for
libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c that I ran into on
our internal dev branch.  I'll investigate whether I can reproduce with
gomp-4_0-branch asap.


I managed to reproduce this problem with the gomp-4_0-branch. Filed as: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63979 .


Thanks,
- Tom


openacc kernels directive -- initial support

2014-11-15 Thread Tom de Vries

Hi,

I'm submitting a patch series with initial support for the oacc kernels 
directive.

The patch series uses pass_parallelize_loops to implement parallelization of 
loops in the oacc kernels region.


The patch series consists of these 8 patches:
...
1  Expand oacc kernels after pass_build_ealias
2  Add pass_oacc_kernels
3  Add pass_ch_oacc_kernels to pass_oacc_kernels
4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
5  Add pass_loop_im to pass_oacc_kernels
6  Add pass_ccp to pass_oacc_kernels
7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
8  Do simple omp lowering for no address taken var
...

The patch series does not yet apply cleanly to trunk, since it's dependent on 
the oacc middle end changes present in the gomp-4_0-branch, already submitted by 
Thomas for trunk.


Furthermore, it's dependent on an assert fix submitted for trunk ('Fix 
gcc_assert in expand_omp_for_static_chunk' @ 
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg01149.html ).


The patch series is intended for trunk, but - given the dependency on the oacc 
middle end changes - has been bootstrapped for x86_64 on top of gomp-4_0-branch.


I'll post the patch series in reply to this email.

Thanks,
- Tom

[ FTR  In order to get clean libgomp and goacc test results in gomp-4_0-branch, 
to have a good basis for testing, I used the following patch set:


 Don't allow flto-partition=balance for fopenacc
   Unsubmitted. This works around a compilation problem for
   libgomp/testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c that I ran into on
   our internal dev branch.  I'll investigate whether I can reproduce with
   gomp-4_0-branch asap.

 Mark fopenacc as LTO option
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00085.html   

 Only use nvidia accelerator if present
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00247.html

 Set default LIBGOMP_PLUGIN_PATH
   @ https://gcc.gnu.org/ml/gcc-patches/2014-11/msg00242.html
]