Hi,
I would like to ask some comments about a small piece of code that
reflect many algorithms in a large application I try to move to openACC
with gfortran. This code manage a pointer to a user defined type with an
allocatable.
I've no problem to allocate and work with this type on the cpu and on
the gpu but a problem raise when I try to reallocate the allocatable
part to a different size (in the "do iter=1,2" loop of the main
program). Each time ligomp raises:
libgomp: cuStreamSynchronize error: an illegal memory access was encountered
I've followed Tobias Burnus suggestions in a previous thread with
options like :
gfortran -g -fopenacc -cpp -fdump-tree-original -fdump-tree-gimple
-fdump-tree-omplower -o testcase_begou testcase_begou.f90
but I've some difficulties to understand the details like:
#pragma acc parallel map(force_present:*tab) map(alloc:tab [pointer
assign, bias: 0]) collapse(2)
for line 80 in the file:
!$acc parallel loop present(tab) collapse(2)
as it seams to allocate something on the GPU while it's yet available?
Not sure.
As I try for several weeks to modify this code (it works with Nvfortran
and Cray Fortran but still not with GNU) to solve the problem I would
like to ask some OpenACC experts if this code is valid or just benefit
of some compiler indulgence. I've open some month ago a PR
https://gcc.gnu.org/bugzilla/attachment.cgi?id=54970, but if the problem
is a bad fortran code I will have to remove also this report as they are
many reports waiting to be processed yet. At the opposite, if it is a
compiler problem I should stop spending time on this piece of code and
wait for Gfortran improvement.
Thanks for your advices
Patrick
module tab_m
#ifdef _OPENACC
use openacc, only: acc_is_present
#endif
implicit none
type r2tab
double precision, dimension(:,:), allocatable :: val
integer :: dim1
integer :: dim2
end type r2tab
contains
subroutine ajoute(tab,n,m)
implicit none
integer, intent(in) ::n,m
type(r2tab), pointer, intent(inout) ::tab
!----------------------------
print*,"=====>> ajoute: allocating on host and on device"
if (.not. associated(tab)) allocate(tab)
if (allocated(tab%val)) deallocate(tab%val)
allocate (tab%val(n,m))
tab%dim1=n
tab%dim2=m
tab%val(:,:)=0.0D0
!$acc enter data copyin(tab)
!$acc enter data copyin(tab%val)
print*,"=====>> ajoute OK"
end subroutine ajoute
subroutine destroy(tab)
implicit none
type(r2tab), pointer, intent(inout) ::tab
print*,"=====>> destroy datas on device and on host"
if (associated(tab)) then
if (allocated(tab%val)) then
!$acc exit data delete(tab%val)
deallocate(tab%val)
endif
!$acc exit data delete(tab)
deallocate(tab)
endif
nullify(tab)
print*,"=====>> destroy OK"
end subroutine destroy
end module tab_m
program main
use tab_m, only: ajoute, destroy, r2tab
use openacc
implicit none
type(r2tab), pointer :: tab=>null()
integer :: i,j,k,iter
double precision :: somme
double precision :: defval=2
do iter=1,2
write(6,'(a)') "================================================="
write(6,'(a,1x,i0,1x,a)') "=======================",iter,"======================="
write(6,'(a)') "================================================="
! Allocate memory on the host and on the device.
call ajoute (tab,10*iter,10*iter)
write(6,'(a,i0,a)')" ====> usage in main: set array to ",iter," on device"
!$acc parallel loop present(tab) collapse(2)
do j=1,tab%dim2
do i=1,tab%dim1
tab%val(i,j)=1.0 *iter
enddo
enddo
print*," ====> usage in main OK"
! Check values now
somme=0
do j=1,tab%dim2
do i=1,tab%dim1
somme=somme+tab%val(i,j)
end do
end do
write(6,'(a,f14.6)') 'Before update from the device, on host (should be 0): ',somme
!$acc update self(tab%val)
somme=0
do j=1,tab%dim2
do i=1,tab%dim1
somme=somme+tab%val(i,j)
end do
end do
write(6,'(a,i0,a,f14.6)') 'After update from device, on host shoud be ',&
& iter*tab%dim2*tab%dim1,': ',somme
call destroy(tab)
tab=>null()
end do ! iter loop
end program main