Hi Tobias!

On 2019-10-24T14:47:58+0200, Tobias Burnus <tobias_bur...@mentor.com> wrote:
> The clause (new in OpenACC 2.6) makes any device code use the local
> memory address for each of the variables specified unless the given
> variable is already present on the current device. – Or in words of
> OpenACC 2.7 (in Sect. 2.7.9 no_create clause):
>
> "The no_create clause may appear on structured data and compute
> constructs." / "For each var in varlist, if var is in shared memory, no
> action is taken; if var is not in shared memory, the no_create clause
> behaves as follows:" [digest: if present, update present count, if
> pointer attach/detach; if not not present, device-local memory used.]
s%not not%not%
s%device-local%local%

> "The restrictions regarding subarrays in the present clause apply to
> this clause."
> Note: The "no_create" maps to the (new) GOMP_MAP_NO_ALLOC in the middle
> end – and all action in libgomp/target.c but only applies to
> GOMP_MAP_NO_ALLOC; hence, the code should only affect OpenACC.

Not sure if 'GOMP_MAP_NO_ALLOC' is the most descriptive name. ;-) I
understand 'no_create' to mean 'present' in combination with an
'if_present' flag that is available as a clause for some (other) OpenACC
directives, correct? So, how about naming this 'GOMP_MAP_IF_PRESENT'
instead of 'GOMP_MAP_NO_ALLOC'? (Jakub?) (But I don't care too much, so
if there's a good reason to prefer 'GOMP_MAP_NO_ALLOC', then that's fine,
too.)

Ah, I just found that Julian (CCed for your information) internally had
proposed 'GOMP_MAP_MAYBE_PRESENT', which seems like another good option
indeed.

For reference:

> --- a/include/gomp-constants.h
> +++ b/include/gomp-constants.h
> @@ -75,6 +75,8 @@ enum gomp_map_kind
>      GOMP_MAP_DEVICE_RESIDENT = (GOMP_MAP_FLAG_SPECIAL_1 | 1),
>      /* OpenACC link. */
>      GOMP_MAP_LINK = (GOMP_MAP_FLAG_SPECIAL_1 | 2),
> +    /* Use device data if present, fall back to host address otherwise. */
> +    GOMP_MAP_NO_ALLOC = (GOMP_MAP_FLAG_SPECIAL_1 | 3),
>      /* Do not map, copy bits for firstprivate instead. */
>      GOMP_MAP_FIRSTPRIVATE = (GOMP_MAP_FLAG_SPECIAL | 0),
>      /* Similarly, but store the value in the pointer rather than

> OK for the trunk?

To synchronize our efforts, I'm attaching an incremental WIP patch. Will
you please have a look at that, merging it in, while I continue to
review?

> PS: This patch is a re-diffed version of the OG9/OG8 version; as some
> other features are not yet on trunk, it misses a test case for
> "no_create(s.y…)" (i.e. the struct component-ref;
> libgomp/testsuite/libgomp.oacc-c-c++-common/nocreate-{3,4}.c); trunk
> also lacks 'acc serial' and, hence, the attach patch lacks the
> OACC_SERIAL_CLAUSE_MASK updates – and gfc_match_omp_map_clause needs
> later to be updated for the allow_derived and allow_common arguments.
> Furthermore, some 'do_detach = false' are missing in libgomp/target.c as
> they do not yet exist on trunk, either.
>
> The openacc-gcc-9 /…-8 branch patch is commit
> 8e74c2ec2b90819c995444370e742864a685209f of Dec 20, 2018. It has been
> posted as https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01418.html

Thanks for providing these references, that's useful.

> libgomp/
>         * testsuite/libgomp.oacc-c-c++-common/nocreate-1.c: New test.
>         * testsuite/libgomp.oacc-c-c++-common/nocreate-2.c: New test.
>         * testsuite/libgomp.oacc-fortran/nocreate-1.f90: New test.
>         * testsuite/libgomp.oacc-fortran/nocreate-2.f90: New test.

Please rename these files to 'no_create*', as that's what the clause is
called.

..., and then:

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -667,6 +667,12 @@ gomp_map_vars_internal (struct gomp_device_descr
> *devicep,
>            has_firstprivate = true;
>            continue;
>          }
> +      else if ((kind & typemask) == GOMP_MAP_NO_ALLOC)
> +        {
> +          tgt->list[i].key = NULL;
> +          tgt->list[i].offset = 0;
> +          continue;
> +        }
>        cur_node.host_start = (uintptr_t) hostaddrs[i];
>        if (!GOMP_MAP_POINTER_P (kind & typemask))
>          cur_node.host_end = cur_node.host_start + sizes[i];
> @@ -892,6 +898,49 @@ gomp_map_vars_internal (struct gomp_device_descr
> *devicep,
>                  cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
>                                        + cur_node.host_start - n->host_start;
>                  continue;
> +              case GOMP_MAP_NO_ALLOC:
> +                {
> +                  cur_node.host_start = (uintptr_t) hostaddrs[i];
> +                  cur_node.host_end = cur_node.host_start + sizes[i];
> +                  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
> +                  if (n != NULL)
> +                    {
> +                      tgt->list[i].key = n;
> +                      tgt->list[i].offset = cur_node.host_start - n->host_start;
> +                      tgt->list[i].length = n->host_end - n->host_start;
> +                      tgt->list[i].copy_from = false;
> +                      tgt->list[i].always_copy_from = false;
> +                      n->refcount++;
> +                    }
> +                  else
> +                    {
> +                      tgt->list[i].key = NULL;
> +                      tgt->list[i].offset = OFFSET_INLINED;
> +                      tgt->list[i].length = sizes[i];
> +                      tgt->list[i].copy_from = false;
> +                      tgt->list[i].always_copy_from = false;
> +                      if (i + 1 < mapnum)
> +                        {
> +                          int kind2 = get_kind (short_mapkind, kinds, i + 1);
> +                          switch (kind2 & typemask)
> +                            {
> +                            case GOMP_MAP_POINTER:
> +                              /* The data is not present but we have an attach
> +                                 or pointer clause next. Skip over it. */
> +                              i++;
> +                              tgt->list[i].key = NULL;
> +                              tgt->list[i].offset = OFFSET_INLINED;
> +                              tgt->list[i].length = sizes[i];
> +                              tgt->list[i].copy_from = false;
> +                              tgt->list[i].always_copy_from = false;
> +                              break;
> +                            default:
> +                              break;
> +                            }
> +                        }
> +                    }
> +                  continue;
> +                }
>                default:
>                  break;
>                }

This I don't grok yet; see the "TODO" comments in the attached
incremental WIP patch.


Regards,
 Thomas
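
For readers following the 'GOMP_MAP_NO_ALLOC' discussion above, here is a minimal sketch of the behavior as summarized in the spec digest: if the variable is present, bump the present count and use the device copy; otherwise allocate nothing and use the local (host) address. This is a toy model only. The 'toy_mapping' struct and the 'toy_lookup' and 'toy_no_create' helpers are hypothetical names invented for illustration; they are not libgomp's splay tree or 'gomp_map_vars_internal', and pointer attach/detach is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a device mapping table; these names are
   illustrative and are not libgomp's data structures.  */
struct toy_mapping
{
  uintptr_t host_start; /* First host address covered.  */
  uintptr_t host_end;   /* One past the last host address covered.  */
  uintptr_t dev_start;  /* Device address corresponding to host_start.  */
  int refcount;         /* Stand-in for the OpenACC present count.  */
};

/* Return the entry fully containing [start, end), or NULL if none does.  */
static struct toy_mapping *
toy_lookup (struct toy_mapping *maps, size_t nmaps,
            uintptr_t start, uintptr_t end)
{
  for (size_t i = 0; i < nmaps; i++)
    if (maps[i].host_start <= start && end <= maps[i].host_end)
      return &maps[i];
  return NULL;
}

/* Model 'no_create (var)': if VAR is present, bump the count and return
   the matching device address; otherwise allocate nothing and hand back
   the host address unchanged.  */
static uintptr_t
toy_no_create (struct toy_mapping *maps, size_t nmaps,
               uintptr_t host, size_t size)
{
  assert (size > 0);
  struct toy_mapping *m = toy_lookup (maps, nmaps, host, host + size);
  if (m != NULL)
    {
      m->refcount++;
      return m->dev_start + (host - m->host_start);
    }
  return host;
}
```

In this model, a lookup that hits returns an address inside the existing device mapping and leaves the count one higher; a lookup that misses returns its argument and leaves the table untouched, which is the "fall back to host address" case from the 'gomp-constants.h' comment.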
From 22ceeb89f787a6279a37d35965f82a4f5b3e3b72 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <tho...@codesourcery.com>
Date: Wed, 6 Nov 2019 00:42:06 +0100
Subject: [PATCH] [WIP] into Add OpenACC 2.6 `no_create' clause support

---
 gcc/fortran/openmp.c                          |  4 ++--
 .../gfortran.dg/goacc/common-block-1.f90      |  3 +++
 .../gfortran.dg/goacc/common-block-2.f90      |  3 +++
 .../gfortran.dg/goacc/data-clauses.f95        | 23 ++++++++++++++++++-
 gcc/testsuite/gfortran.dg/goacc/data-tree.f95 |  3 ++-
 .../gfortran.dg/goacc/kernels-tree.f95        |  3 ++-
 .../gfortran.dg/goacc/parallel-tree.f95       |  3 ++-
 libgomp/target.c                              |  8 +++++++
 .../libgomp.oacc-fortran/common-block-2.f90   |  4 +++-
 .../libgomp.oacc-fortran/nocreate-1.f90       | 10 ++++++--
 .../libgomp.oacc-fortran/nocreate-2.f90       | 16 ++++++++++---
 11 files changed, 68 insertions(+), 12 deletions(-)

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 47c5cf5d422..822af5dbe7c 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1449,7 +1449,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
           if ((mask & OMP_CLAUSE_NO_CREATE)
               && gfc_match ("no_create ( ") == MATCH_YES
               && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-                                           OMP_MAP_NO_ALLOC))
+                                           OMP_MAP_NO_ALLOC, true))
             continue;
           if ((mask & OMP_CLAUSE_NOGROUP)
               && !c->nogroup
@@ -1969,7 +1969,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_NO_CREATE | OMP_CLAUSE_PRESENT \
-   | OMP_CLAUSE_DEFAULT | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_DEFAULT | OMP_CLAUSE_WAIT)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR | OMP_CLAUSE_COPY \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE \
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
index ea437526b46..5c162a5b884 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
@@ -51,6 +51,9 @@ program test
   !$acc data pcopyout(/blockA/, /blockB/, e, v)
   !$acc end data
 
+  !$acc data no_create(/blockA/, /blockB/, e, v)
+  !$acc end data
+
   !$acc parallel private(/blockA/, /blockB/, e, v)
   !$acc end parallel
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
index 1ba945019f9..33c0d3f5fb4 100644
--- a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
@@ -39,6 +39,9 @@ program test
   !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
   !$acc end data
 
+  !$acc data no_create(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
   !$acc parallel private(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
   !$acc end parallel
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
index b94214e8b63..c1b3e1dec38 100644
--- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
@@ -111,6 +111,27 @@ contains
 
   !$acc end data
 
+  !$acc parallel no_create (tip) ! { dg-error "POINTER" }
+  !$acc end parallel
+  !$acc parallel no_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc end parallel
+  !$acc parallel deviceptr (i) no_create (i) ! { dg-error "multiple clauses" }
+  !$acc end parallel
+  !$acc parallel copy (i) no_create (i) ! { dg-error "multiple clauses" }
+  !$acc end parallel
+  !$acc parallel copyin (i) no_create (i) ! { dg-error "multiple clauses" }
+  !$acc end parallel
+  !$acc parallel copyout (i) no_create (i) ! { dg-error "multiple clauses" }
+  !$acc end parallel
+
+  !$acc parallel no_create (i, c, r, ia, ca, ra, asa, rp, ti, vi, aa)
+  !$acc end parallel
+  !$acc kernels no_create (i, c, r, ia, ca, ra, asa, rp, ti, vi, aa)
+  !$acc end kernels
+  !$acc data no_create (i, c, r, ia, ca, ra, asa, rp, ti, vi, aa)
+  !$acc end data
+
+
   !$acc parallel present (tip) ! { dg-error "POINTER" }
   !$acc end parallel
   !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" }
@@ -256,4 +277,4 @@ contains
   !$acc end data
 
 end subroutine foo
-end module test
\ No newline at end of file
+end module test
diff --git a/gcc/testsuite/gfortran.dg/goacc/data-tree.f95 b/gcc/testsuite/gfortran.dg/goacc/data-tree.f95
index f16d62cce69..454417d6a05 100644
--- a/gcc/testsuite/gfortran.dg/goacc/data-tree.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/data-tree.f95
@@ -7,6 +7,7 @@ program test
   logical :: l = .true.
 
   !$acc data if(l) copy(i), copyin(j), copyout(k), create(m) &
+  !$acc no_create(n) &
   !$acc present(o), pcopy(p), pcopyin(r), pcopyout(s), pcreate(t) &
   !$acc deviceptr(u)
   !$acc end data
@@ -19,7 +20,7 @@ end program test
 ! { dg-final { scan-tree-dump-times "map\\(to:j\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(from:k\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(alloc:m\\)" 1 "original" } }
-
+! { dg-final { scan-tree-dump-times "map\\(no_alloc:n\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(force_present:o\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(tofrom:p\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(to:r\\)" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95 b/gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95
index a70f1e737bd..5583ffb4d04 100644
--- a/gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/kernels-tree.f95
@@ -8,6 +8,7 @@ program test
 
   !$acc kernels if(l) async num_gangs(i) num_workers(i) vector_length(i) &
   !$acc copy(i), copyin(j), copyout(k), create(m) &
+  !$acc no_create(n) &
   !$acc present(o), pcopy(p), pcopyin(r), pcopyout(s), pcreate(t) &
   !$acc deviceptr(u)
   !$acc end kernels
@@ -25,7 +26,7 @@ end program test
 ! { dg-final { scan-tree-dump-times "map\\(to:j\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(from:k\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(alloc:m\\)" 1 "original" } }
-
+! { dg-final { scan-tree-dump-times "map\\(no_alloc:n\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(force_present:o\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(tofrom:p\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(to:r\\)" 1 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95 b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
index 2697bb79e7f..e33653bdd78 100644
--- a/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/parallel-tree.f95
@@ -9,6 +9,7 @@ program test
 
   !$acc parallel if(l) async num_gangs(i) num_workers(i) vector_length(i) &
   !$acc reduction(max:q), copy(i), copyin(j), copyout(k), create(m) &
+  !$acc no_create(n) &
   !$acc present(o), pcopy(p), pcopyin(r), pcopyout(s), pcreate(t) &
   !$acc deviceptr(u), private(v), firstprivate(w)
   !$acc end parallel
@@ -28,7 +29,7 @@ end program test
 ! { dg-final { scan-tree-dump-times "map\\(to:j\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(from:k\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(alloc:m\\)" 1 "original" } }
-
+! { dg-final { scan-tree-dump-times "map\\(no_alloc:n\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(force_present:o\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(tofrom:p\\)" 1 "original" } }
 ! { dg-final { scan-tree-dump-times "map\\(to:r\\)" 1 "original" } }
diff --git a/libgomp/target.c b/libgomp/target.c
index 632e7020538..0338648946d 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -669,6 +669,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
             }
           else if ((kind & typemask) == GOMP_MAP_NO_ALLOC)
             {
+              //TODO TS is confused. Handling this here, will inhibit 'gomp_map_vars_existing' being used a bit further below.
               tgt->list[i].key = NULL;
               tgt->list[i].offset = 0;
               continue;
@@ -905,6 +906,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                   splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
                   if (n != NULL)
                     {
+                      //TODO TS is confused. Due to the way the handling of 'GOMP_MAP_NO_ALLOC' is done in the first loop, we're here re-doing 'gomp_map_vars_existing'?
                       tgt->list[i].key = n;
                       tgt->list[i].offset = cur_node.host_start - n->host_start;
                       tgt->list[i].length = n->host_end - n->host_start;
@@ -914,6 +916,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                     }
                   else
                     {
+                      //TODO This is basically 'GOMP_MAP_FIRSTPRIVATE_INT' handling?
                       tgt->list[i].key = NULL;
                       tgt->list[i].offset = OFFSET_INLINED;
                       tgt->list[i].length = sizes[i];
@@ -925,6 +928,11 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep,
                           switch (kind2 & typemask)
                             {
                             case GOMP_MAP_POINTER:
+                              //TODO abort();
+                              //TODO This code path is exercised by 'libgomp.oacc-fortran/nocreate-2.f90'.
+                              //TODO TS does not yet understand why this is needed.
+                              //TODO Is this somehow similar to 'GOMP_MAP_TO_PSET' handling?
+
                               /* The data is not present but we have an attach
                                  or pointer clause next. Skip over it. */
                               i++;
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
index 018b37d00bb..ad04ca997c2 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
@@ -76,7 +76,9 @@ program main
 
   !$acc enter data create(b)
 
-  !$acc parallel loop pcopy(b)
+  !$acc parallel loop &
+  !$acc no_create(b) ! ... here means 'present(b)'.
+  !TODO But we get: "libgomp: cuStreamSynchronize error: an illegal memory access was encountered".
   do i = 1, n
     b(i) = i
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/nocreate-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/nocreate-1.f90
index f048355d7df..ca9611b777c 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/nocreate-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/nocreate-1.f90
@@ -1,20 +1,26 @@
-! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
+! { dg-do run }
 
 ! Test no_create clause with data construct when data is present/not present.
 
 program nocreate
   use openacc
   implicit none
+  logical :: shared_memory
   integer, parameter :: n = 512
   integer :: myarr(n)
   integer i
 
+  shared_memory = .false.
+  !$acc kernels copyin (shared_memory)
+  shared_memory = .true.
+  !$acc end kernels
+
   do i = 1, n
     myarr(i) = 0
   end do
 
   !$acc data no_create (myarr)
-  if (acc_is_present (myarr)) stop 1
+  if (acc_is_present (myarr) .neqv. shared_memory) stop 1
   !$acc end data
 
   !$acc enter data copyin (myarr)
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/nocreate-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/nocreate-2.f90
index 34444ecf5b0..16227b8ae22 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/nocreate-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/nocreate-2.f90
@@ -1,14 +1,20 @@
-! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
+! { dg-do run }
 
 ! Test no_create clause with data/parallel constructs.
 
 program nocreate
   use openacc
   implicit none
+  logical :: shared_memory
   integer, parameter :: n = 512
   integer :: myarr(n)
   integer i
 
+  shared_memory = .false.
+  !$acc kernels copyin (shared_memory)
+  shared_memory = .true.
+  !$acc end kernels
+
   do i = 1, n
     myarr(i) = 0
   end do
@@ -16,7 +22,11 @@ program nocreate
   call do_on_target(myarr, n)
 
   do i = 1, n
-    if (myarr(i) .ne. i) stop 1
+    if (shared_memory) then
+      if (myarr(i) .ne. i * 2) stop 1
+    else
+      if (myarr(i) .ne. i) stop 2
+    end if
   end do
 
   do i = 1, n
@@ -28,7 +38,7 @@ program nocreate
   !$acc exit data copyout(myarr)
 
   do i = 1, n
-    if (myarr(i) .ne. i * 2) stop 2
+    if (myarr(i) .ne. i * 2) stop 3
   end do
 
 end program nocreate
-- 
2.17.1