Re: [PATCH] OpenACC documentation for libgomp

2016-01-12 Thread James Norris

Bernd,

On 01/11/2016 11:23 AM, Bernd Schmidt wrote:

On 01/05/2016 04:47 PM, James Norris wrote:

I've updated the original patch after some very helpful
comments from Sandra (thank you, thank you).

OK to commit to trunk?


I'm probably not fully qualified to review the contents either, but few people
are and it looks reasonable enough that I guess I'll just ack it. Before that,
some questions though:


+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{function acc_async_test(arg);}
+@item   @tab @code{integer(kind=acc_handle_kind) arg}
+@item   @tab @code{logical acc_async_test}
+@end multitable


I guess this is how Fortran functions and their args/return values are
documented? Do we have other examples of this somewhere?


Yes, in the earlier section that describes OpenMP. One thing
that needs changing: 'Prototype' should become 'Interface'
for Fortran.


+about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
+sections 4.1 and 4.2 of the “The OpenACC
+Application Programming Interface”, Version 2.0, June, 2013.}.


Non-ascii characters. I'm guessing this should probably be some kind of texinfo
@something{} block; OTOH references to C standards in standards.texi just name
them in plain text.


As Jakub pointed out in his follow-up, those instances should
use @uref rather than double quotes.



I wonder if things like OpenMP and OpenACC should be mentioned in
standards.texi, but that is tangential to this patch.



That's a good idea. Thanks!

Thanks for taking the time for the review.

Jim





Re: [PATCH] OpenACC documentation for libgomp

2016-01-12 Thread James Norris

Hi!

On 01/11/2016 11:35 AM, Jakub Jelinek wrote:

On Tue, Jan 05, 2016 at 09:47:59AM -0600, James Norris wrote:

I've updated the original patch after some very helpful
comments from Sandra (thank you, thank you).


I'd prefer if OpenMP
* Enabling OpenMP::How to enable OpenMP for your applications.
* Runtime Library Routines::   The OpenMP runtime application programming
interface.
* Environment Variables::  Influencing runtime behavior with environment
variables.
chapters precede the OpenACC chapters, most libgomp users are not really
using any offloading, which is new, but using OpenMP for host
parallelization, and only far fewer users are actually trying some
acceleration, whether OpenACC or OpenMP offloading parts.


OpenACC content has been moved after the OpenMP content.



As Bernd found, there are some UTF-8 quotes or what in the patch, those
need to be replaced by some texinfo markup, say


+sections 4.1 and 4.2 of the ???The OpenACC
+Application Programming Interface???, Version 2.0, June, 2013.}.


@uref{http://www.openacc.org/, OpenACC Application Programming Interface, 
Version 2.0, June, 2013}
or something similar.


Those were double quotes and have been changed to @uref's.

Patch committed to trunk.

Thanks for taking time for the review.

Jim


Index: ChangeLog
===
--- ChangeLog	(revision 232278)
+++ ChangeLog	(working copy)
@@ -1,3 +1,7 @@
+2016-01-12  James Norris  
+
+	* libgomp.texi: Updates for OpenACC.
+
 2016-01-11  Alexander Monakov  
 
 	* plugin/plugin-nvptx.c (link_ptx): Do not set CU_JIT_TARGET.
Index: libgomp.texi
===
--- libgomp.texi	(revision 232278)
+++ libgomp.texi	(working copy)
@@ -99,6 +99,16 @@
interface.
 * Environment Variables::  Influencing runtime behavior with environment 
variables.
+* Enabling OpenACC::   How to enable OpenACC for your
+   applications.
+* OpenACC Runtime Library Routines:: The OpenACC runtime application
+   programming interface.
+* OpenACC Environment Variables:: Influencing OpenACC runtime behavior with
+   environment variables.
+* CUDA Streams Usage:: Notes on the implementation of
+   asynchronous operations.
+* OpenACC Library Interoperability:: OpenACC library interoperability with the
+   NVIDIA CUBLAS library.
 * The libgomp ABI::Notes on the external ABI presented by libgomp.
 * Reporting Bugs:: How to report bugs in the GNU Offloading and
Multi Processing Runtime Library.
@@ -1790,6 +1800,1272 @@
 
 
 @c -
+@c Enabling OpenACC
+@c -
+
+@node Enabling OpenACC
+@chapter Enabling OpenACC
+
+To activate the OpenACC extensions for C/C++ and Fortran, the compile-time 
+flag @option{-fopenacc} must be specified.  This enables the OpenACC directive
+@code{#pragma acc} in C/C++ and @code{!$accp} directives in free form,
+@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
+@code{!$} conditional compilation sentinels in free form and @code{c$},
+@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
+arranges for automatic linking of the OpenACC runtime library 
+(@ref{OpenACC Runtime Library Routines}).
+
+A complete description of all OpenACC directives accepted may be found in 
+the @uref{http://www.openacc.org/, OpenACC} Application Programming
+Interface manual, version 2.0.
+
+Note that this is an experimental feature and subject to
+change in future versions of GCC.  See
+@uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
+
+
+
+@c -
+@c OpenACC Runtime Library Routines
+@c -
+
+@node OpenACC Runtime Library Routines
+@chapter OpenACC Runtime Library Routines
+
+The runtime routines described here are defined by section 3 of the OpenACC
+specifications in version 2.0.
+They have C linkage, and do not throw exceptions.
+Generally, they are available only for the host, with the exception of
+@code{acc_on_device}, which is available for both the host and the
+acceleration device.
+
+@menu
+* acc_get_num_devices:: Get number of devices for the given device
+type.
+* acc_set_device_type:: Set type of device accelerator to use.
+* acc_get_device_type:: Get type of device accelerator to be used.
+* acc_set_device_num::  Set device number 

Re: [PATCH] OpenACC documentation for libgomp

2016-01-11 Thread Bernd Schmidt

On 01/05/2016 04:47 PM, James Norris wrote:

I've updated the original patch after some very helpful
comments from Sandra (thank you, thank you).

OK to commit to trunk?


I'm probably not fully qualified to review the contents either, but few 
people are and it looks reasonable enough that I guess I'll just ack it. 
Before that, some questions though:



+@item @emph{Fortran}:
+@multitable @columnfractions .20 .80
+@item @emph{Prototype}: @tab @code{function acc_async_test(arg);}
+@item   @tab @code{integer(kind=acc_handle_kind) arg}
+@item   @tab @code{logical acc_async_test}
+@end multitable


I guess this is how Fortran functions and their args/return values are 
documented? Do we have other examples of this somewhere? I've Cc'ed Paul 
Thomas at random as one of the Fortran maintainers for input on whether 
this is a good way to document things.



+about @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM} can be found in
+sections 4.1 and 4.2 of the “The OpenACC
+Application Programming Interface”, Version 2.0, June, 2013.}.


Non-ascii characters. I'm guessing this should probably be some kind of 
texinfo @something{} block; OTOH references to C standards in 
standards.texi just name them in plain text.


I wonder if things like OpenMP and OpenACC should be mentioned in 
standards.texi, but that is tangential to this patch.



Bernd


Re: [PATCH] OpenACC documentation for libgomp

2016-01-11 Thread Jakub Jelinek
On Tue, Jan 05, 2016 at 09:47:59AM -0600, James Norris wrote:
> I've updated the original patch after some very helpful
> comments from Sandra (thank you, thank you).

I'd prefer if OpenMP
* Enabling OpenMP::How to enable OpenMP for your applications.
* Runtime Library Routines::   The OpenMP runtime application programming
   interface.
* Environment Variables::  Influencing runtime behavior with environment
   variables.
chapters precede the OpenACC chapters, most libgomp users are not really
using any offloading, which is new, but using OpenMP for host
parallelization, and only far fewer users are actually trying some
acceleration, whether OpenACC or OpenMP offloading parts.

As Bernd found, there are some UTF-8 quotes or what in the patch, those
need to be replaced by some texinfo markup, say

> +sections 4.1 and 4.2 of the ???The OpenACC
> +Application Programming Interface???, Version 2.0, June, 2013.}.

@uref{http://www.openacc.org/, OpenACC Application Programming Interface, 
Version 2.0, June, 2013}
or something similar.

Otherwise LGTM.

Jakub


Re: [PATCH] OpenACC documentation for libgomp

2016-01-10 Thread Sandra Loosemore

On 01/05/2016 08:47 AM, James Norris wrote:

Hi!

I've updated the original patch after some very helpful
comments from Sandra (thank you, thank you).

OK to commit to trunk?


I'm assuming this is now waiting for technical review?  I can give it 
another read-through for tech-writing issues but I don't feel competent 
to approve the content.


-Sandra



Re: [PATCH] OpenACC documentation for libgomp

2016-01-05 Thread James Norris

Hi!

I've updated the original patch after some very helpful
comments from Sandra (thank you, thank you).

OK to commit to trunk?

Thanks!
Jim

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 87ec337..fc7b9fe 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2016-01-XX  James Norris  
+	Thomas Schwinge  
+
+	* libgomp.texi (CUDA Streams Usage): New chapter.
+
 2016-01-04  Jakub Jelinek  
 
 	* libgomp.texi: Bump @copying's copyright year.
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 480353a..6c421c3 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -94,6 +94,16 @@ changed to GNU Offloading and Multi Processing Runtime Library.
 @comment  better formatting.
 @comment
 @menu
+* Enabling OpenACC::   How to enable OpenACC for your
+   applications.
+* OpenACC Runtime Library Routines:: The OpenACC runtime application
+   programming interface.
+* OpenACC Environment Variables::Influencing OpenACC runtime behavior with
+   environment variables.
+* CUDA Streams Usage:: Notes on the implementation of
+   asynchronous operations.
+* OpenACC Library Interoperability:: OpenACC library interoperability with the
+   NVIDIA CUBLAS library.
 * Enabling OpenMP::How to enable OpenMP for your applications.
 * Runtime Library Routines::   The OpenMP runtime application programming 
interface.
@@ -113,6 +123,1255 @@ changed to GNU Offloading and Multi Processing Runtime Library.
 
 
 @c -
+@c Enabling OpenACC
+@c -
+
+@node Enabling OpenACC
+@chapter Enabling OpenACC
+
+To activate the OpenACC extensions for C/C++ and Fortran, the compile-time 
+flag @option{-fopenacc} must be specified.  This enables the OpenACC directive
+@code{#pragma acc} in C/C++ and @code{!$accp} directives in free form,
+@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
+@code{!$} conditional compilation sentinels in free form and @code{c$},
+@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
+arranges for automatic linking of the OpenACC runtime library 
+(@ref{OpenACC Runtime Library Routines}).
+
+A complete description of all OpenACC directives accepted may be found in 
+the @uref{http://www.openacc.org/, OpenMP Application Programming
+Interface} manual, version 2.0.
+
+Note that this is an experimental feature, incomplete, and subject to
+change in future versions of GCC.  See
+@uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
+
+
+
+@c -
+@c OpenACC Runtime Library Routines
+@c -
+
+@node OpenACC Runtime Library Routines
+@chapter OpenACC Runtime Library Routines
+
+The runtime routines described here are defined by section 3 of the OpenACC
+specifications in version 2.0.
+They have C linkage, and do not throw exceptions.
+Generally, they are available only for the host, with the exception of
+@code{acc_on_device}, which is available for both the host and the
+acceleration device.
+
+@menu
+* acc_get_num_devices:: Get number of devices for the given device type.
+* acc_set_device_type:: Set type of device accelerator to use.
+* acc_get_device_type:: Get type of device accelerator to be used.
+* acc_set_device_num::  Set device number to use.
+* acc_get_device_num::  Get device number to be used.
+* acc_async_test::  Tests for completion of a specific asynchronous operation.
+* acc_async_test_all::  Tests for completion of all asychronous operations.
+* acc_wait::Wait for completion of a specific asynchronous operation.
+* acc_wait_all::Waits for completion of all asyncrhonous operations.
+* acc_wait_all_async::  Wait for completion of all asynchronous operations.
+* acc_wait_async::  Wait for completion of asynchronous operations.
+* acc_init::Initialize runtime for a specific device type.
+* acc_shutdown::Shuts down the runtime for a specific device type.
+* acc_on_device::   Whether executing on a particular device
+* acc_malloc::  Allocate device memory.
+* acc_free::Free device memory.
+* acc_copyin::  Allocate device memory and copy host memory to it.
+* acc_present_or_copyin::   If the data is not present on the device, allocate device memory and copy from host memory.
+* acc_create::  Allocate device memory and map it to host memory.
+* 

Re: [PATCH] OpenACC documentation for libgomp

2015-12-17 Thread Sandra Loosemore

On 12/16/2015 06:29 AM, James Norris wrote:

Hi,

Attached is the patch to add OpenACC documentation for libgomp.

Ok to commit to trunk?


I have some copy-editing nits.  I can't say I'm familiar enough with 
this functionality to comment intelligently on the content, though.



+To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
+flag @command{-fopenacc} must be specified.  This enables the OpenACC directive


s/@command/@option
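
For readers following along, here is a minimal sketch of what the quoted
paragraph describes for C: a `#pragma acc` directive that takes effect only
when the file is compiled with `-fopenacc`. The function names `vec_add` and
`openacc_enabled` are made up for illustration and are not part of the patch.
The `_OPENACC` macro is the one the OpenACC specification says is defined when
OpenACC is enabled.

```c
#include <assert.h>

/* Hypothetical helper, not from the patch: element-wise vector add.
   Built with -fopenacc, the loop may be offloaded to a device.
   Built without it, the pragma is ignored and the loop runs on the host. */
void vec_add(const float *a, const float *b, float *out, int n)
{
    assert(n >= 0);
#pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(out[0:n])
    for (int i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Returns 1 if this translation unit was compiled with OpenACC enabled. */
int openacc_enabled(void)
{
#ifdef _OPENACC
    return 1;
#else
    return 0;
#endif
}
```

Something like `gcc -fopenacc file.c` should enable the directive and link the
runtime library. The same source still builds and gives the same result
without the flag.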


+@node acc_get_num_devices
+@section @code{acc_get_num_devices} -- Get number of devices for given device type
+@table @asis
+@item @emph{Description}
+This routine returns a value indicating the
+number of devices available for the given device type.  It determines
+the number of devices in a @emph{passive} manner.  In other words, it
+does not alter the state within the runtime environment aside from
+possibly initializing an uninitialized device.  This aspect allows


s/aspect //


+the routine to be called without concern for altering the interaction
+with an attached accelerator device.


I think "...concern that it might alter" is what you intend to say 
here.
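
As a rough illustration of the "passive" query described in the quoted text,
the sketch below asks how many non-host devices are available. The wrapper
name `count_accelerators` is hypothetical. `acc_get_num_devices` and
`acc_device_not_host` come from `openacc.h` as defined by the OpenACC 2.0
specification. The `#ifdef` fallback to 0 is an assumption added so the file
also builds without `-fopenacc`.

```c
#include <assert.h>

#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical wrapper: number of accelerator (non-host) devices,
   or 0 when the program was built without OpenACC support. */
int count_accelerators(void)
{
    int n = 0;
#ifdef _OPENACC
    /* Per the quoted description, this does not change runtime state,
       aside from possibly initializing an uninitialized device. */
    n = acc_get_num_devices(acc_device_not_host);
#endif
    assert(n >= 0);   /* a device count is never negative */
    return n;
}
```

A caller might use the result to decide whether offloading is worth trying
before choosing a device with `acc_set_device_type`.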


I'm not too sure about the formatting style here.  It does seem to be 
consistent with the style of the existing content of the manual to have 
a separate section for each function instead of listing them in a table, 
but the existing docs have prototypes that are missing from your 
additions, and I'd really like to see index entries for all these things.



+@node acc_on_device
+@section @code{acc_on_device} -- Whether executing on a particular device
+@table @asis
+@item @emph{Description}:
+This routine tells the program whether it is executing on a particular
+device.  Based on the argument passed, GCC tries to evaluate this to a
+constant at compile time, but library functions are also provided, for


s/, for/ for/


+@node CUDA Streams Usage
+@chapter CUDA Streams Usage
+
+This applies to the @code{nvptx} plugin only.
+
+The library provides elements that perform asynchronous movement of
+data and asynchronous operation of computing constructs.  This
+asynchronous functionality is implemented by making use of CUDA
+streams@footnote{See "Stream Management" in "CUDA Driver API",
+TRM-06703-001, Version 5.5, July 2013, for additional information}.
+
+The primary means by which the asychronous functionality is accessed
+is through the use of those OpenACC directives which make use of the


s/which/that/


+@code{async} and @code{wait} clauses.  When the @code{async} clause is
+first used with a directive, it will create a CUDA stream.  If an


s/will create/creates/


+@code{async-argument} is used with the @code{async} clause, then the
+stream will be associated with the specified @code{async-argument}.


s/will be/is/


+
+Following the creation of an association between a CUDA stream and the
+@code{async-argument} of an @code{async} clause, both the @code{wait}
+clause and the @code{wait} directive can be used.  When either the
+clause or directive is used after stream creation, it creates a
+rendezvous point whereby execution will wait until all operations


s/will wait/waits/


+associated with the @code{async-argument}, that is, stream, have
+completed.
+
+Normally, the management of the streams that are created as a result of
+using the @code{async} clause, is done without any intervention by the
+caller.  This implies the association between the @code{async-argument}


You've got an unnecessary comma there.  I think this would be easier to 
parse if rewritten "Normally, streams that are created as a result of 
using the @code{async} clause are managed without any intervention by 
the caller."
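
To make the async/wait behaviour in the quoted paragraphs concrete, here is a
small sketch under stated assumptions. The function `scale_two` and the
async-argument values 1 and 2 are invented for illustration. According to the
quoted text, with the nvptx plugin each async-argument would be associated
with its own CUDA stream. On other targets, or without `-fopenacc`, the
loops simply run in order on the host.

```c
#include <assert.h>

/* Hypothetical example: scale two arrays on independent async queues,
   then wait for both before returning to the caller. */
void scale_two(float *x, float *y, int n, float k)
{
    assert(n >= 0);

    /* Work queued under async-argument 1. */
#pragma acc parallel loop copy(x[0:n]) async(1)
    for (int i = 0; i < n; ++i)
        x[i] *= k;

    /* Work queued under async-argument 2, which may overlap with queue 1. */
#pragma acc parallel loop copy(y[0:n]) async(2)
    for (int i = 0; i < n; ++i)
        y[i] *= k;

    /* Rendezvous point: execution waits until the operations associated
       with both async-arguments have completed. */
#pragma acc wait(1, 2)
}
```

Nothing here manages streams explicitly, which matches the point of the
paragraph under review: the association is handled by the runtime unless the
caller changes it.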



+and the CUDA stream will be maintained for the lifetime of the program.


s/will be/is/


+However, this association can be changed through the use of the library
+function @code{acc_set_cuda_stream}.  When the function
+@code{acc_set_cuda_stream} is used, the CUDA stream that was


s/is used/is called/ ??


+originally associated with the @code{async} clause will be destroyed.


s/will be/is/


+Caution should be taken when changing the association as subsequent
+references to the @code{async-argument} will be referring to a different


s/will be referring/refer/


+As the OpenACC library is built using the CUDA Driver API, the question has
+arisen on what impact does using the OpenACC library have on a program that
+uses the Runtime library, or a library based on the Runtime library, e.g.,
+CUBLAS@footnote{See section 2.26, "Interactions with the CUDA Driver API" in
+"CUDA Runtime API", Version 5.5, July 2013 and section 2.27, "VDPAU
+Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
+July 2013, for additional information on library interoperability.}.


This is really hard to parse.  Can we say something like

The OpenACC library uses the CUDA Driver API, and may interact with 
programs that use the Runtime library directly, or another library based 
on 

[PATCH] OpenACC documentation for libgomp

2015-12-16 Thread James Norris

Hi,

Attached is the patch to add OpenACC documentation for libgomp.

Ok to commit to trunk?

Thanks!
Jim
Index: libgomp.texi
===
--- libgomp.texi	(revision 231662)
+++ libgomp.texi	(working copy)
@@ -94,10 +94,25 @@
 @comment  better formatting.
 @comment
 @menu
+* Enabling OpenACC::   How to enable OpenACC for your
+   applications.
+* OpenACC Runtime Library Routines::
+   The OpenACC runtime application
+   programming interface.
+* OpenACC Environment Variables::
+   Influencing OpenACC runtime behavior with
+   environment variables.
+* CUDA Streams Usage:: Notes on the implementation of
+   asynchronous operations.
+* OpenACC Library Interoperability::
+   OpenACC library interoperability with the
+   NVIDIA CUBLAS library.
 * Enabling OpenMP::How to enable OpenMP for your applications.
-* Runtime Library Routines::   The OpenMP runtime application programming 
+* OpenMP Runtime Library Routines::
+   The OpenMP runtime application programming 
interface.
-* Environment Variables::  Influencing runtime behavior with environment 
+* OpenMP Environment Variables::
+   Influencing runtime behavior with environment 
variables.
 * The libgomp ABI::Notes on the external ABI presented by libgomp.
 * Reporting Bugs:: How to report bugs in the GNU Offloading and
@@ -113,6 +128,643 @@
 
 
 @c -
+@c Enabling OpenACC
+@c -
+
+@node Enabling OpenACC
+@chapter Enabling OpenACC
+
+To activate the OpenACC extensions for C/C++ and Fortran, the compile-time 
+flag @command{-fopenacc} must be specified.  This enables the OpenACC directive
+@code{#pragma acc} in C/C++ and @code{!$accp} directives in free form,
+@code{c$acc}, @code{*$acc} and @code{!$acc} directives in fixed form,
+@code{!$} conditional compilation sentinels in free form and @code{c$},
+@code{*$} and @code{!$} sentinels in fixed form, for Fortran.  The flag also
+arranges for automatic linking of the OpenACC runtime library 
+(@ref{OpenACC Runtime Library Routines}).
+
+A complete description of all OpenACC directives accepted may be found in 
+the @uref{http://www.openacc.org/, OpenMP Application Programming
+Interface} manual, version 2.0.
+
+Note that this is an experimental feature, incomplete, and subject to
+change in future versions of GCC.  See
+@uref{https://gcc.gnu.org/wiki/OpenACC} for more information.
+
+
+
+@c -
+@c OpenACC Runtime Library Routines
+@c -
+
+@node OpenACC Runtime Library Routines
+@chapter OpenACC Runtime Library Routines
+
+The runtime routines described here are defined by section 3 of the OpenACC
+specifications in version 2.0.
+They have C linkage, and do not throw exceptions.
+Generally, they are available only for the host, with the exception of
+@code{acc_on_device}, which is available for both the host and the
+acceleration device.
+
+@menu
+* acc_get_num_devices:: Get number of devices for the given device type
+* acc_set_device_type::
+* acc_get_device_type::
+* acc_set_device_num::
+* acc_get_device_num::
+* acc_init::
+* acc_shutdown::
+* acc_on_device::   Whether executing on a particular device
+* acc_malloc::
+* acc_free::
+* acc_copyin::
+* acc_present_or_copyin::
+* acc_create::
+* acc_present_or_create::
+* acc_copyout::
+* acc_delete::
+* acc_update_device::
+* acc_update_self::
+* acc_map_data::
+* acc_unmap_data::
+* acc_deviceptr::
+* acc_hostptr::
+* acc_is_present::
+* acc_memcpy_to_device::
+* acc_memcpy_from_device::
+
+API routines for target platforms.
+
+* acc_get_current_cuda_device::
+* acc_get_current_cuda_context::
+* acc_get_cuda_stream::
+* acc_set_cuda_stream::
+@end menu
+
+
+
+@node acc_get_num_devices
+@section @code{acc_get_num_devices} -- Get number of devices for given device type
+@table @asis
+@item @emph{Description}
+This routine returns a value indicating the
+number of devices available for the given device type.  It determines
+the number of devices in a @emph{passive} manner.  In other words, it
+does not alter the state within the runtime environment aside from
+possibly initializing an uninitialized device.  This aspect allows
+the routine to be called without concern for altering the interaction
+with an attached accelerator device.
+
+@item @emph{Reference}:
+@uref{http://www.openacc.org/, OpenACC specification v2.0}, section