Re: Document OpenACC status for GCC 6

2016-09-20 Thread Gerald Pfeifer
[ Old e-mail alert ]

On Fri, 22 Apr 2016, Thomas Schwinge wrote:
> Thanks for the review; OK to commit as follows?  And then, should
> something be added to the "News" section on 
> itself, too?  (I don't know the policy for that.  We didn't suggest that
> for GCC 5, because at that time we described the support as a
> "preliminary implementation of the OpenACC 2.0a specification"; now 
> it's much more complete and usable.)

Yes, definitely.  Sorry for not picking this up earlier, but this
definitely is a strong News item.  If you feel that particular item
is not suitable any longer, perhaps there is another one (or there
are other ones, plural)?

As a rule of thumb, when in doubt, propose a News item.  As a group 
we have been way too conservative on that front.

Gerald


Re: Document OpenACC status for GCC 6

2016-04-25 Thread Jakub Jelinek
On Fri, Apr 22, 2016 at 11:26:11AM +0200, Thomas Schwinge wrote:
> Index: htdocs/gcc-6/changes.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
> retrieving revision 1.75
> diff -u -p -r1.75 changes.html

LGTM.

> --- htdocs/gcc-6/changes.html 21 Apr 2016 15:57:43 -  1.75
> +++ htdocs/gcc-6/changes.html 22 Apr 2016 09:22:19 -
> @@ -124,6 +124,52 @@ For more information, see the
>  
>  New Languages and Language specific improvements
>  
> +Compared to GCC 5, the GCC 6 release series includes a much improved
> +implementation of the <a href="http://www.openacc.org/">OpenACC 2.0a
> +  specification</a>.  Highlights are:
> +
> +  In addition to single-threaded host-fallback execution, offloading is
> + supported for nvptx (Nvidia GPUs) on x86_64 and PowerPC 64-bit
> + little-endian GNU/Linux host systems.  For nvptx offloading, with the
> + OpenACC parallel construct, the execution model allows for an arbitrary
> + number of gangs, up to 32 workers, and 32 vectors.
> +  Initial support for parallelized execution of OpenACC kernels
> + constructs:
> + 
> +   Parallelization of a kernels region is switched on
> + by -fopenacc combined with -O2 or
> + higher.
> +   Code is offloaded onto multiple gangs, but executes with just one
> + worker, and a vector length of 1.
> +   Directives inside a kernels region are not supported.
> +   Loops with reductions can be parallelized.
> +   Only kernels regions with one loop nest are parallelized.
> +   Only the outer-most loop of a loop nest can be parallelized.
> +   Loop nests containing sibling loops are not parallelized.
> + 
> + Typically, using the OpenACC parallel construct gives much better
> + performance, compared to the initial support of the OpenACC kernels
> + construct.
> +  The device_type clause is not supported.
> + The bind and nohost clauses are not
> + supported.  The host_data directive is not supported in
> + Fortran.
> +  Nested parallelism (cf. CUDA dynamic parallelism) is not
> + supported.
> +  Usage of OpenACC constructs inside multithreaded contexts (such as
> + created by OpenMP, or pthread programming) is not supported.
> +  If a call to the acc_on_device function has a
> + compile-time constant argument, the function call evaluates to a
> + compile-time constant value only for C and C++ but not for
> + Fortran.
> +
> +See the <a href="https://gcc.gnu.org/wiki/OpenACC">OpenACC</a>
> +and <a href="https://gcc.gnu.org/wiki/Offloading">Offloading</a> wiki pages
> +for further information.
> +  
> +
>  
>  
>  C family

Jakub
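
The acc_on_device item in the patch quoted above can be illustrated with a short C sketch. It is a hypothetical example and not part of the patch: the helper name running_on_host is an assumption, and the _OPENACC guard is there only so that the file also builds when OpenACC is disabled. acc_on_device and acc_device_host are declared in <openacc.h>.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: returns nonzero when executing on the host. */
int running_on_host(void)
{
#ifdef _OPENACC
    /* The argument is a compile-time constant, so per the release notes
       the call can evaluate to a compile-time constant in C and C++
       (but not in Fortran). */
    return acc_on_device(acc_device_host);
#else
    /* Without OpenACC support there is no device to run on. */
    return 1;
#endif
}
```

Called from ordinary host code, the helper returns nonzero whether or not the file is built with -fopenacc.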


Re: Document OpenACC status for GCC 6

2016-04-24 Thread Sandra Loosemore

On 04/22/2016 03:26 AM, Thomas Schwinge wrote:


> Thanks for the review; OK to commit as follows?  And then, should
> something be added to the "News" section on
> itself, too?  (I don't know the policy for that.  We didn't suggest that
> for GCC 5, because at that time we described the support as a
> "preliminary implementation of the OpenACC 2.0a specification"; now it's
> much more complete and usable.)


I think the new patch is acceptable for release notes, but TBH I don't 
know what the policy is for updating "News", either.  :-S


-Sandra



Re: Document OpenACC status for GCC 6

2016-04-22 Thread Thomas Schwinge
Hi!

On Thu, 21 Apr 2016 12:19:31 -0600, Sandra Loosemore wrote:
> On 04/21/2016 10:21 AM, Thomas Schwinge wrote:
> > + Code will be offloaded onto multiple gangs, but executes with
> > +   just one worker, and a vector length of 1.
> 
> "will be" (future) vs "executes" (present).  Assuming this is all 
> supposed to describe current behavior, please write consistently in the 
> present tense.

Thanks for that.  I keep getting that wrong...

> My only comment on the rest of the patch is that "a kernels region" 
> sounds like a mistake but I think that is the official terminology?

Correct: it's an "OpenACC kernels construct/directive/region".
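
For readers new to the term: a "kernels region" is the block of code governed by an OpenACC kernels directive. The following is a minimal C sketch and is not taken from the patch; the function name sum_kernels and the data clauses are illustrative assumptions. It keeps to the shape the release notes describe as parallelizable in GCC 6: a single loop nest with a reduction, and no directives inside the region.

```c
/* Hypothetical example: sum n integers inside an OpenACC kernels region. */
long sum_kernels(int n, const int *x)
{
    long total = 0;

    /* One loop nest, no nested directives, no sibling loops.  The data
       clauses are an assumption chosen for this sketch. */
#pragma acc kernels copyin(x[0:n]) copy(total)
    {
        for (int i = 0; i < n; ++i)
            total += x[i];   /* reduction on total */
    }

    return total;
}
```

Per the release notes, parallelization of such a region is switched on by -fopenacc combined with -O2 or higher. Without -fopenacc the pragma is ignored and the loop runs sequentially on the host.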

> -Sandra the nit-picky

Thanks for the review; OK to commit as follows?  And then, should
something be added to the "News" section on 
itself, too?  (I don't know the policy for that.  We didn't suggest that
for GCC 5, because at that time we described the support as a
"preliminary implementation of the OpenACC 2.0a specification"; now it's
much more complete and usable.)

Index: htdocs/gcc-6/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/changes.html,v
retrieving revision 1.75
diff -u -p -r1.75 changes.html
--- htdocs/gcc-6/changes.html   21 Apr 2016 15:57:43 -  1.75
+++ htdocs/gcc-6/changes.html   22 Apr 2016 09:22:19 -
@@ -124,6 +124,52 @@ For more information, see the
 
 New Languages and Language specific improvements
 
+Compared to GCC 5, the GCC 6 release series includes a much improved
+implementation of the <a href="http://www.openacc.org/">OpenACC 2.0a
+  specification</a>.  Highlights are:
+
+  In addition to single-threaded host-fallback execution, offloading is
+   supported for nvptx (Nvidia GPUs) on x86_64 and PowerPC 64-bit
+   little-endian GNU/Linux host systems.  For nvptx offloading, with the
+   OpenACC parallel construct, the execution model allows for an arbitrary
+   number of gangs, up to 32 workers, and 32 vectors.
+  Initial support for parallelized execution of OpenACC kernels
+   constructs:
+   
+ Parallelization of a kernels region is switched on
+   by -fopenacc combined with -O2 or
+   higher.
+ Code is offloaded onto multiple gangs, but executes with just one
+   worker, and a vector length of 1.
+ Directives inside a kernels region are not supported.
+ Loops with reductions can be parallelized.
+ Only kernels regions with one loop nest are parallelized.
+ Only the outer-most loop of a loop nest can be parallelized.
+ Loop nests containing sibling loops are not parallelized.
+   
+   Typically, using the OpenACC parallel construct gives much better
+   performance, compared to the initial support of the OpenACC kernels
+   construct.
+  The device_type clause is not supported.
+   The bind and nohost clauses are not
+   supported.  The host_data directive is not supported in
+   Fortran.
+  Nested parallelism (cf. CUDA dynamic parallelism) is not
+   supported.
+  Usage of OpenACC constructs inside multithreaded contexts (such as
+   created by OpenMP, or pthread programming) is not supported.
+  If a call to the acc_on_device function has a
+   compile-time constant argument, the function call evaluates to a
+   compile-time constant value only for C and C++ but not for
+   Fortran.
+
+See the <a href="https://gcc.gnu.org/wiki/OpenACC">OpenACC</a>
+and <a href="https://gcc.gnu.org/wiki/Offloading">Offloading</a> wiki pages
+for further information.
+  
+
 
 
 C family
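
To make the execution-model item in the patch above concrete, here is a minimal C sketch of an OpenACC parallel construct. It is an illustration under stated assumptions, not code from the patch: the function name saxpy_acc, the gang count of 64, and the data clauses are hypothetical choices, while num_workers(32) and vector_length(32) simply mirror the nvptx limits quoted in the release notes.

```c
/* Hypothetical example: y[i] = a * x[i] + y[i] for i in [0, n). */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    /* The gang count is arbitrary; worker and vector sizes are set to
       the nvptx maxima mentioned in the release notes. */
#pragma acc parallel loop num_gangs(64) num_workers(32) vector_length(32) \
    copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Built without -fopenacc, the pragma is ignored and the loop runs sequentially. With -fopenacc but no offload target available, execution falls back to the single-threaded host path that the first item of the patch describes.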


Grüße
 Thomas


Re: Document OpenACC status for GCC 6

2016-04-21 Thread Sandra Loosemore

On 04/21/2016 10:21 AM, Thomas Schwinge wrote:


> + Code will be offloaded onto multiple gangs, but executes with
> +   just one worker, and a vector length of 1.


"will be" (future) vs "executes" (present).  Assuming this is all 
supposed to describe current behavior, please write consistently in the 
present tense.



> +   Typically, using the OpenACC parallel construct will give much better
> +   performance, compared to the initial support of the OpenACC kernels
> +   construct.


Here too.

My only comment on the rest of the patch is that "a kernels region" 
sounds like a mistake but I think that is the official terminology?


-Sandra the nit-picky