On 12/16/2015 06:29 AM, James Norris wrote:
Hi,

Attached is the patch to add OpenACC documentation for libgomp.

Ok to commit to trunk?

I have some copy-editing nits. I can't say I'm familiar enough with this functionality to comment intelligently on the content, though....

+To activate the OpenACC extensions for C/C++ and Fortran, the compile-time
+flag @command{-fopenacc} must be specified.  This enables the OpenACC directive

s/@command/@option

+@node acc_get_num_devices
+@section @code{acc_get_num_devices} -- Get number of devices for given device type
+@table @asis
+@item @emph{Description}
+This routine returns a value indicating the
+number of devices available for the given device type.  It determines
+the number of devices in a @emph{passive} manner.  In other words, it
+does not alter the state within the runtime environment aside from
+possibly initializing an uninitialized device.  This aspect allows

s/aspect //

+the routine to be called without concern for altering the interaction
+with an attached accelerator device.

I think "...concern that it might alter...." is what you intend to say here.

I'm not too sure about the formatting style here. It does seem to be consistent with the style of the existing content of the manual to have a separate section for each function instead of listing them in a table, but the existing docs have prototypes that are missing from your additions, and I'd really like to see index entries for all these things....
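
To illustrate the kind of thing I mean, here is a rough sketch modeled on the layout of the existing OpenMP routine sections. The prototype is my reading of the OpenACC specification, and the use of @findex (rather than @cindex, or whichever index the manual actually prints) is an assumption on my part, so please check both:

```texinfo
@node acc_get_num_devices
@section @code{acc_get_num_devices} -- Get number of devices for given device type
@c Index entry; assumes a function index is printed or merged via @synindex.
@findex acc_get_num_devices
@table @asis
@item @emph{Description}
@c Description text as in the patch goes here.

@item @emph{C/C++}:
@multitable @columnfractions .20 .80
@item @emph{Prototype}: @tab @code{int acc_get_num_devices(acc_device_t devicetype);}
@end multitable
@end table
```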

+@node acc_on_device
+@section @code{acc_on_device} -- Whether executing on a particular device
+@table @asis
+@item @emph{Description}:
+This routine tells the program whether it is executing on a particular
+device.  Based on the argument passed, GCC tries to evaluate this to a
+constant at compile time, but library functions are also provided, for

s/, for/ for/

+@node CUDA Streams Usage
+@chapter CUDA Streams Usage
+
+This applies to the @code{nvptx} plugin only.
+
+The library provides elements that perform asynchronous movement of
+data and asynchronous operation of computing constructs.  This
+asynchronous functionality is implemented by making use of CUDA
+streams@footnote{See "Stream Management" in "CUDA Driver API",
+TRM-06703-001, Version 5.5, July 2013, for additional information}.
+
+The primary means by which the asychronous functionality is accessed
+is through the use of those OpenACC directives which make use of the

s/which/that/

+@code{async} and @code{wait} clauses.  When the @code{async} clause is
+first used with a directive, it will create a CUDA stream.  If an

s/will create/creates/

+@code{async-argument} is used with the @code{async} clause, then the
+stream will be associated with the specified @code{async-argument}.

s/will be/is/

+
+Following the creation of an association between a CUDA stream and the
+@code{async-argument} of an @code{async} clause, both the @code{wait}
+clause and the @code{wait} directive can be used.  When either the
+clause or directive is used after stream creation, it creates a
+rendezvous point whereby execution will wait until all operations

s/will wait/waits/

+associated with the @code{async-argument}, that is, stream, have
+completed.
+
+Normally, the management of the streams that are created as a result of
+using the @code{async} clause, is done without any intervention by the
+caller.  This implies the association between the @code{async-argument}

You've got an unnecessary comma there. I think this would be easier to parse if rewritten "Normally, streams that are created as a result of using the @code{async} clause are managed without any intervention by the caller."

+and the CUDA stream will be maintained for the lifetime of the program.

s/will be/is/

+However, this association can be changed through the use of the library
+function @code{acc_set_cuda_stream}.  When the function
+@code{acc_set_cuda_stream} is used, the CUDA stream that was

s/is used/is called/ ??

+originally associated with the @code{async} clause will be destroyed.

s/will be/is/

+Caution should be taken when changing the association as subsequent
+references to the @code{async-argument} will be referring to a different

s/will be referring/refer/

+As the OpenACC library is built using the CUDA Driver API, the question has
+arisen on what impact does using the OpenACC library have on a program that
+uses the Runtime library, or a library based on the Runtime library, e.g.,
+CUBLAS@footnote{See section 2.26, "Interactions with the CUDA Driver API" in
+"CUDA Runtime API", Version 5.5, July 2013 and section 2.27, "VDPAU
+Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
+July 2013, for additional information on library interoperability.}.

This is really hard to parse.  Can we say something like

The OpenACC library uses the CUDA Driver API, and may interact with programs that use the Runtime library directly, or another library based on the Runtime library....

+This chapter will describe the use cases and what changes are
+required in order to use both the OpenACC library and the CUBLAS and Runtime
+libraries within a program.

s/will describe/describes/
s/what changes are required in order/the requirements/  ??

+
+@section First invocation: NVIDIA CUBLAS library API
+
+In this first use case (see below), a function in the CUBLAS library is called
+prior to any of the functions in the OpenACC library. More specifically, the
+function @code{cublasCreate()}.
+
+When invoked, the function will initialize the library and allocate the
s/will initialize/initializes/
s/allocate/allocates/

+hardware resources on the host and the device on behalf of the caller. Once
+the initialization and allocation has completed, a handle is returned to the
+caller. The OpenACC library also requires initialization and allocation of
+hardware resources. Since the CUBLAS library has already allocated the
+hardware resources for the device, all that is left to do is to initialize
+the OpenACC library and acquire the hardware resources on the host.
+
+Prior to calling the OpenACC function that will initialize the library and
+allocate the host hardware resources, one needs to acquire the device number

s/will initialize/initializes/
s/allocate/allocates/
s/one needs/you need/

+that was allocated during the call to @code{cublasCreate()}. The invoking of the

The GNU coding standards say "Please do not write ‘()’ after a function name just to indicate it is a function." There are too many instances of this in the following text for me to pick them out individually.... just search and replace, please.
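
For example, the line quoted just above would become something like this (only the markup changes; the wording is still the patch's):

```texinfo
that was allocated during the call to @code{cublasCreate}.
```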

s/The invoking of the/Invoking/

+runtime library function @code{cudaGetDevice()} will accomplish this. Once

s/will accomplish/accomplishes/

+acquired, the device number is passed along with the device type as
+parameters to the OpenACC library function @code{acc_set_device_num()}.
+
+Once the call to @code{acc_set_device_num()} has completed, the OpenACC
+library will be using the  context that was created during the call to

s/will be using/uses/

+@code{cublasCreate()}. In other words, both libraries will be sharing the

s/will be sharing/share/

+same context.
+
+@verbatim

I think code examples should use @smallexample, not @verbatim.
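
As a sketch of what that might look like for the first use case: this is not the example from the patch (I trimmed that from the quote), just the three calls the text names, in the order the text describes them. The variable names h and dev are placeholders of mine and error checking is omitted. Note that literal braces inside @smallexample need to be written as @{ and @}, which @verbatim did not require.

```texinfo
@smallexample
    /* First use case: CUBLAS is initialized before OpenACC.  */
    cublasHandle_t h;
    int dev;

    cublasCreate (&h);      /* Initializes CUBLAS.  */
    cudaGetDevice (&dev);   /* Device number allocated by the call above.  */
    acc_set_device_num (dev, acc_device_nvidia);
@end smallexample
```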

+@section First invocation: OpenACC library API
+
+In this second use case (see below), a function in the OpenACC library is
+called prior to any of the functions in the CUBLAS library. More specificially,
+the function acc_set_device_num().

Need @code markup on that.

+
+In the use case presented here, the function @code{acc_set_device_num()}
+is used to both initialize the OpenACC library and allocate the hardware
+resources on the host and the device. In the call to the function, the
+call parameters specify which device to use, i.e., 'dev', and what device

What is the purpose of these quotes instead of real markup? I don't see 'dev' referenced at all in the subsequent paragraphs, so does it need to be named at all?
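
If the name isn't needed, one possible rewording simply drops it. This is only a suggestion, and it assumes nothing later in the chapter depends on the name; if something does, @var{dev} or @code{dev} would be the usual markup.

```texinfo
resources on the host and the device.  The arguments to the call specify
which device to use and what device type to use, i.e.,
@code{acc_device_nvidia}.
```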

+type to use, i.e., @code{acc_device_nvidia}. It should be noted that this
+is but one method to initialize the OpenACC library and allocate the
+appropriate hardware resources. Other methods are available through the
+use of environment variables and these will be discussed in the next section.
+
+Once the call to @code{acc_set_device_num()} has completed, other OpenACC
+functions can be called as seen with multiple calls being made to
+@code{acc_copyin()}. In addition, calls can be made to functions in the
+CUBLAS library. In the use case a call to @code{cublasCreate()} is made
+subsequent to the calls to @code{acc_copyin()}.
+As seen in the previous use case, a call to @code{cublasCreate()} will
+initialize the CUBLAS library and allocate the hardware resources on the

s/will initialize/initializes/
s/allocate/allocates/

+host and the device.  However, since the device has already been allocated,
+@code{cublasCreate()} will only initialize the CUBLAS library and allocate

s/will only initialize/only initializes/
s/allocate/allocates/

+the appropriate hardware resources on the host. The context that was created
+as part of the OpenACC initialization will be shared with the CUBLAS library,

s/will be shared/is shared/

+@section OpenACC library and environment variables
+
+There are two environment variables associated with the OpenACC library that
+may be used to control the device type and device number.
+Namely, @env{ACC_DEVICE_TYPE} and @env{ACC_DEVICE_NUM}. In the second

Namely, that sentence no verb. :-P

How about joining the clause to the first sentence:

s/.  Namely,/:/

+use case, the device type and device number were specified using

s/were/are/

+@code{acc_set_device_num()}. However, @env{ACC_DEVICE_TYPE} and
+@env{ACC_DEVICE_NUM} could have been defined and the call to
+@code{acc_set_device_num()} would be not be required. At the time of the
+call to @code{acc_copyin()}, these two environment variables would be
+sampled and their values used.

This is really hard to parse because of the "could have"/"would" stuff. I am guessing what you want to say is something like:

If your program does not call @code{acc_set_device_num}, @code{acc_copyin} uses these environment variables instead.

+The use of the environment variables is only relevant when an OpenACC function
+is called prior to a call to @code{cudaCreate()}. If @code{cudaCreate()}
+is called prior to a call to an OpenACC function, then a call to
+@code{acc_set_device_num()}, must be done@footnote{More complete information

Another extra comma there, and this would be more directly phrased as

If @code{cudaCreate} is called prior to a call to an OpenACC function, then you must call @code{acc_set_device_num}.

-Sandra the nit-picky
