Hi Tobias!

On 2024-02-19T22:36:51+0100, Tobias Burnus <tbur...@baylibre.com> wrote:
> While waiting for some testing to finish, I got distracted and added the
> very low hanging OpenACC 3.3 fruits, i.e. those Fortran routines that directly
> map to their C counter part.
>
> Comments, remarks?

Thanks, that largely looks straight-forward.  I've not done an in-depth
review, just a few comments.  Resolve these as you think is necessary,
and then 'git push'.

I don't know much about Fortran interfaces -- I trust you've got that
under control.  ;-)

Thanks for the test cases.  Would be nice to have test cases covering all
interfaces -- but I don't think we're currently complete in that regard,
so shall not hold your contribution to higher standards.

> OpenACC: Add Fortran routines 
> acc_{alloc,free,hostptr,deviceptr,memcpy_{to,from}_device*}
>
> These routines map simply to the C counterpart and are meanwhile
> defined in OpenACC 3.3. (There are additional routine changes,
> including the Fortran addition of acc_attach/acc_detach, that
> require more work than a simple addition of an interface and
> are therefore excluded.)

I saw:

  - <https://gcc.gnu.org/PR113997> "Bogus 'Warning: Interface mismatch in 
global procedure' with C binding"
  - <https://gcc.gnu.org/PR114002> "[OpenACC][OpenACC 3.3] Add 
'acc_attach'/'acc_detach' routine"

> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi

>  @section @code{acc_malloc} -- Allocate device memory.
>  @table @asis
>  @item @emph{Description}
> -This function allocates @var{len} bytes of device memory. It returns
> +This function allocates @var{bytes} of device memory. It returns

Not '@var{bytes} {+bytes+}' or similar?

>  @section @code{acc_memcpy_to_device} -- Copy host memory to device memory.

>  @item @emph{C/C++}:
>  @multitable @columnfractions .20 .80
> -@item @emph{Prototype}: @tab @code{acc_memcpy_to_device(d_void *dest, h_void 
> *src, size_t bytes);}
> +@item @emph{Prototype}: @tab @code{void acc_memcpy_to_device(d_void* 
> data_dev_dest,}
> +@item                   @tab @code{h_void* data_host_src, size_t bytes);}
> +@item @emph{Prototype}: @tab @code{void acc_memcpy_to_device_async(d_void* 
> data_dev_dest,}
> +@item                   @tab @code{h_void* data_host_src, size_t bytes, int 
> async_arg);}
> +@end multitable
> +
> +@item @emph{Fortran}:
> +@multitable @columnfractions .20 .80
> +@item @emph{Interface}: @tab @code{subroutine 
> acc_memcpy_to_device(data_dev_dest, &}
> +@item                   @tab @code{data_host_src, bytes)}
> +@item @emph{Interface}: @tab @code{subroutine 
> acc_memcpy_to_device_async(data_dev_dest, &}
> +@item                   @tab @code{data_host_src, bytes, async_arg)}
> +@item                   @tab @code{type(c_ptr), value :: data_dev_dest}
> +@item                   @tab @code{type(*), dimension(*) :: data_host_src}
> +@item                   @tab @code{integer(c_size_t), value :: bytes}
> +@item                   @tab @code{integer(acc_handle_kind), value :: 
> async_arg}
>  @end multitable

I did wonder whether we should (here, and elsewhere) also update the
'@menu' in "OpenACC Runtime Library Routines" to list the 'async'
routines -- but the OpenACC specification also doesn't, so it shall be
fine as is here, too.

>  @item @emph{Reference}:
>  @uref{https://www.openacc.org, OpenACC specification v2.6}, section
> -3.2.31.
> +3.2.31  @uref{https://www.openacc.org, OpenACC specification v3.3}, section

(Fine as is, of course, but could -- generally -- simplify the 'diff' by
starting the new '@uref' on its own line.)

> +3.2.26..

Double '.'.

> --- a/libgomp/openacc.f90
> +++ b/libgomp/openacc.f90
> @@ -758,6 +758,93 @@ module openacc_internal
>        integer (c_int), value :: async
>      end subroutine
>    end interface
> +
> +  interface
> +    type(c_ptr) function acc_malloc (bytes) bind(C)
> +[...]
> +    end subroutine
> +  end interface
>  end module openacc_internal

Assuming that 'module openacc_internal' currently is sorted per
appearance in the OpenACC specification (?), I suggest we continue to do
so.  (..., like in 'openacc_lib.h', too.)

> @@ -794,6 +881,9 @@ module openacc
>    public :: acc_copyin_async, acc_create_async, acc_copyout_async
>    public :: acc_delete_async, acc_update_device_async, acc_update_self_async
>    public :: acc_copyout_finalize, acc_delete_finalize
> +  public :: acc_malloc, acc_free, acc_map_data, acc_unmap_data, acc_deviceptr
> +  public :: acc_hostptr, acc_memcpy_to_device, acc_memcpy_to_device_async
> +  public :: acc_memcpy_from_device, acc_memcpy_from_device_async

Likewise.

> @@ -871,9 +961,6 @@ module openacc
>      procedure :: acc_on_device_h
>    end interface
>  
> -  ! acc_malloc: Only available in C/C++
> -  ! acc_free: Only available in C/C++
> -
>    ! As vendor extension, the following code supports both 32bit and 64bit
>    ! arguments for "size"; the OpenACC standard only permits default-kind
>    ! integers, which are of kind 4 (i.e. 32 bits).
> @@ -953,20 +1040,12 @@ module openacc
>      procedure :: acc_update_self_array_h
>    end interface
>  
> -  ! acc_map_data: Only available in C/C++
> -  ! acc_unmap_data: Only available in C/C++
> -  ! acc_deviceptr: Only available in C/C++
> -  ! acc_hostptr: Only available in C/C++
> -
>    interface acc_is_present
>      procedure :: acc_is_present_32_h
>      procedure :: acc_is_present_64_h
>      procedure :: acc_is_present_array_h
>    end interface
>  
> -  ! acc_memcpy_to_device: Only available in C/C++
> -  ! acc_memcpy_from_device: Only available in C/C++
> -
>    interface acc_copyin_async
>      procedure :: acc_copyin_async_32_h
>      procedure :: acc_copyin_async_64_h

Is that now a different style that we're not listing the new interfaces
in 'module openacc' here?

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/acc_host_device_ptr.f90
> @@ -0,0 +1,43 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +! Fortran version of libgomp.oacc-c-c++-common/lib-59.c 

I like to also put a cross reference into the originating C/C++ test
case, so that anyone adjusting either one also is aware that another one
may need adjusting, too.

> +  ! The following assumes sizeof(void*) being the same on host and device:

That's generally required anyway.

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/acc-memcpy.f90
> @@ -0,0 +1,47 @@
> +! { dg-do run }
> +! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
> +
> +! based on libgomp.oacc-c-c++-common/lib-60.c

Likewise.


Grüße
 Thomas

Reply via email to