On Thu, 11 Apr 2024, Thomas Schwinge wrote:

> Hi Chung-Lin, Richard!
> 
> From me just a few mechanical pieces, see below.  Richard, are you able
> to again comment on Chung-Lin's general strategy, as I'm not at all
> familiar with those parts of the code?

I've queued all stage1 material and will be only able to slowly look
at it after we released.

> On 2024-04-03T19:50:55+0800, Chung-Lin Tang <clt...@pllab.cs.nthu.edu.tw> 
> wrote:
> > On 2023/10/30 8:46 PM, Richard Biener wrote:
> >>>
> >>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> >>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> >>> flag.
> >>>
> >>> The actual optimization then is done in this second patch.  Chung-Lin
> >>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> >>> I don't have much experience with most of the following generic code, so
> >>> would appreciate a helping hand, whether that conceptually makes sense as
> >>> well as from the implementation point of view:
> >
> > First of all, I have removed all of the gimplify-stage scanning and setting 
> > of
> > DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no 
> > changes to
> > gimplify.cc now)
> >
> > I remember this code was an artifact of earlier attempts to allow 
> > struct-member
> > pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed 
> > anyways.
> > I think the omp_data_* member accesses when building child function side
> > receiver_refs is blocking points-to analysis from working (didn't try 
> > digging deeper)
> >
> > Also during gimplify, VAR_DECLs appeared to be reused (at least in some 
> > cases) for map
> > clause decl reference building, so hoping that the variables "happen to be" 
> > single-use and
> > DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY 
> > does appear to be
> > a little risky.
> >
> > However, for firstprivate pointers processed during omp-low, it appears to 
> > be somewhat different.
> > (see below description)
> >
> >> No, I don't think you can use that flag on non-default-defs, nor
> >> preserve it on copying.  So
> >> it also doesn't nicely extend to DECLs as done by the patch.  We
> >> currently _only_ use it
> >> for incoming parameters.  When used on arbitrary code you can get to for 
> >> example
> >> 
> >> ptr1(points-to-readony-memory) = &p->x;
> >> ... access via ptr1 ...
> >> ptr2 = &p->x;
> >> ... access via ptr2 ...
> >> 
> >> where both are your OMP regions differently constrained (the constrain is 
> >> on the
> >> code in the region, _not_ on the actual protections of the pointed to
> >> data, much like
> >> for the fortran case).  But now CSE comes along and happily replaces all 
> >> ptr2
> >> with ptr2 in the second region and ... oops!
> >
> > Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 
> > in the second region"?
> >
> > That doesn't happen, because during omp-lower/expand, OMP target regions 
> > (which is all that
> > this applies currently) is separated into different individual child 
> > functions.
> >
> > (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during 
> > omp-lower, when
> > for firstprivate pointers (i.e. 'a' here) we set this bit when constructing 
> > the first load
> > of this pointer)
> >
> >   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> >   {
> >     foo (a, a[8]);
> >     r = a[8];
> >   }
> >   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> >   {
> >     foo (a, a[12]);
> >     r = a[12];
> >   }
> >
> > After omp-expand (before SSA):
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> > {
> >  ...
> >   <bb 5> :
> >   D.2962 = .omp_data_i->D.2947;
> >   a.8 = D.2962;
> >   r.1 = (*a.8)[12];
> >   foo (a.8, r.1);
> >   r.1 = (*a.8)[12];
> >   D.2965 = .omp_data_i->r;
> >   *D.2965 = r.1;
> >   return;
> > }
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> > {
> >   ...
> >   <bb 3> :
> >   D.2968 = .omp_data_i->D.2939;
> >   a.4 = D.2968;
> >   r.0 = (*a.4)[8];
> >   foo (a.4, r.0);
> >   r.0 = (*a.4)[8];
> >   D.2971 = .omp_data_i->r;
> >   *D.2971 = r.0;
> >   return;
> > }
> >
> > So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to
> > SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a 
> > default-def
> > for an PARM_DECL, at least conceptually.
> >
> > (If offloading was structured significantly differently, say if child 
> > functions
> > were separated much earlier before omp-lowering, than this 
> > readonly-modifier might
> > possibly be a direct application of 'r' in the "fn spec" attribute)
> >
> > Other changes since first version of patch include:
> > 1) update of C/C++ FE changes to new style in c-family/c-omp.cc
> > 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
> > 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to 
> > verify removal of a MEM load, also as Thomas suggested.
> 
> Thanks!
> 
> > I have re-tested this patch using mainline, with no regressions. Is this 
> > okay for mainline?
> 
> > 2024-04-03  Chung-Lin Tang  <clt...@baylibre.com>
> >
> > gcc/c-family/ChangeLog:
> >
> >     * c-omp.cc (c_omp_address_inspector::expand_array_base):
> >     Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> >     (c_omp_address_inspector::expand_component_selector): Likewise.
> >
> > gcc/fortran/ChangeLog:
> >
> >     * trans-openmp.cc (gfc_trans_omp_array_section):
> >     Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
> >
> > gcc/ChangeLog:
> >
> >     * gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
> >     for VAR_DECLs.
> >     * omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
> >     variables of receiver refs.
> >     * tree-pretty-print.cc (dump_omp_clause):
> >     Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
> >     (dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
> >     * tree-ssanames.cc (make_ssa_name_fn): Set
> >     SSA_NAME_POINTS_TO_READONLY_MEMORY if DECL_POINTS_TO_READONLY is set.
> >     * tree.h (DECL_POINTS_TO_READONLY): New macro.
> >     (OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.
> >
> > gcc/testsuite/ChangeLog:
> >
> >     * c-c++-common/goacc/readonly-1.c: Adjust testcase.
> >     * c-c++-common/goacc/readonly-2.c: New testcase.
> >     * gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
> 
> > --- a/gcc/c-family/c-omp.cc
> > +++ b/gcc/c-family/c-omp.cc
> > @@ -3907,6 +3907,8 @@ c_omp_address_inspector::expand_array_base (tree c,
> >      }
> >    else if (c2)
> >      {
> > +      if (OMP_CLAUSE_MAP_READONLY (c))
> > +   OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> >        OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> >        OMP_CLAUSE_CHAIN (c) = c2;
> >        if (implicit_p)
> > @@ -4051,6 +4053,8 @@ c_omp_address_inspector::expand_component_selector 
> > (tree c,
> >      }
> >    else if (c2)
> >      {
> > +      if (OMP_CLAUSE_MAP_READONLY (c))
> > +   OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> >        OMP_CLAUSE_CHAIN (c2) = OMP_CLAUSE_CHAIN (c);
> >        OMP_CLAUSE_CHAIN (c) = c2;
> >        c = c2;
> 
> (So this replaces the 'gcc/c/c-typeck.cc:handle_omp_array_sections',
> 'gcc/cp/semantics.cc:handle_omp_array_sections' changes of the previous
> patch revision?)
> 
> Are we sure that really only the 'else if (c2)' branches need to handle
> this, and explicitly not the preceding 'if (c3)' branches, too?  I
> suggest we add a comment and/or handling, as necessary.  If that makes
> sense, maybe handle for both 'c3', 'c2' via a 'bool readonly_p = [...]',
> similar to the existing 'bool implicit_p'?
> 
> > --- a/gcc/fortran/trans-openmp.cc
> > +++ b/gcc/fortran/trans-openmp.cc
> > @@ -2561,6 +2561,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
> > gfc_exec_op op,
> >    ptr2 = fold_convert (ptrdiff_type_node, ptr2);
> >    OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, ptrdiff_type_node,
> >                                      ptr, ptr2);
> > +  if (n->u.map.readonly)
> > +    OMP_CLAUSE_MAP_POINTS_TO_READONLY (node3) = 1;
> >  }
> >  
> >  static tree
> 
> > --- a/gcc/gimple-expr.cc
> > +++ b/gcc/gimple-expr.cc
> > @@ -385,6 +385,8 @@ copy_var_decl (tree var, tree name, tree type)
> >    DECL_CONTEXT (copy) = DECL_CONTEXT (var);
> >    TREE_USED (copy) = 1;
> >    DECL_SEEN_IN_BIND_EXPR_P (copy) = 1;
> > +  if (VAR_P (var))
> > +    DECL_POINTS_TO_READONLY (copy) = DECL_POINTS_TO_READONLY (var);
> >    DECL_ATTRIBUTES (copy) = DECL_ATTRIBUTES (var);
> >    if (DECL_USER_ALIGN (var))
> >      {
> 
> > --- a/gcc/omp-low.cc
> > +++ b/gcc/omp-low.cc
> > @@ -14003,6 +14003,8 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, 
> > omp_context *ctx)
> >             if (ref_to_array)
> >               x = fold_convert_loc (clause_loc, TREE_TYPE (new_var), x);
> >             gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue);
> > +           if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (c) && VAR_P (x))
> > +             DECL_POINTS_TO_READONLY (x) = 1;
> >             if ((is_ref && !ref_to_array)
> >                 || ref_to_ptr)
> >               {
> 
> > --- a/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-1.c
> > @@ -48,17 +48,17 @@ int main (void)
> >  
> >  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc declare 
> > map\\(to:y\\) map\\(readonly,to:s\\) map\\(readonly,to:x\\)" 1 "original" } 
> > } */
> >  
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 
> > \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 
> > \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 
> > \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 
> > \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:x\\\[0\\\] \\\[len: 
> > \[0-9\]+\\\]\\)" 1 "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 
> > 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 
> > 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 
> > 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 
> > 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 
> > "original" { target { c } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) map\\(readonly,to:\\*s.ptr 
> > \\\[len: \[0-9\]+\\\]\\) map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 
> > 0\\\]\\) map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 
> > "original" { target { c } } } } */
> >  
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > -/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) .+ 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,firstprivate:x \\\[pointer assign, bias: 0\\\]\\)" 1 
> > "original" { target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> > +/* { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(to:y\\\[0\\\] \\\[len: \[0-9\]+\\\]\\) 
> > map\\(readonly,to:\\*NON_LVALUE_EXPR <s.ptr> \\\[len: \[0-9\]+\\\]\\) 
> > map\\(pt_readonly,attach_detach:s.ptr \\\[bias: 0\\\]\\) 
> > map\\(readonly,to:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\)" 1 "original" { 
> > target { c++ } } } } */
> >  
> >  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache 
> > \\(readonly:x\\\[0\\\] \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> >  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc cache \\(y\\\[0\\\] 
> > \\\[len: \[0-9\]+\\\]\\);$" 4 "original" } } */
> 
> > --- /dev/null
> > +++ b/gcc/testsuite/c-c++-common/goacc/readonly-2.c
> > @@ -0,0 +1,16 @@
> > +/* { dg-additional-options "-O -fdump-tree-phiprop -fdump-tree-fre" } */
> > +
> > +#pragma acc routine
> > +extern void foo (int *ptr, int val);
> > +
> > +int main (void)
> > +{
> > +  int r, a[32];
> > +  #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> > +  {
> > +    foo (a, a[8]);
> > +    r = a[8];
> > +  }
> > +}
> > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> > MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
> > +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> > MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
> 
> In the tree where I've been testing your patch, I've not been seeing
> 'MEM[x]' but '(*x)', therefore:
> 
>     -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 2 "phiprop1" } } */
>     -/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> MEM\\\[\[^_\]+_\[0-9\]+\\(ptro\\)\\\]\\\[8\\\];" 1 "fre1" } } */
>     +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 2 "phiprop1" } } */
>     +/* { dg-final { scan-tree-dump-times "r\.\[_0-9\]+ = 
> \\(\\\*_\[0-9\]+\\(ptro\\)\\)\\\[8\\\];" 1 "fre1" } } */
> 
> Maybe that's due to something else in my (long...) Git branch; just make
> sure you've got PASSes here, eventually.
> 
> > --- a/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > +++ b/gcc/testsuite/gfortran.dg/goacc/readonly-1.f90
> > @@ -80,16 +80,16 @@ end program main
> >  ! The front end turns OpenACC 'declare' into OpenACC 'data'.
> >  !   { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:\\*b\\) map\\(alloc:b.+ map\\(to:\\*c\\) map\\(alloc:c.+" 
> > 1 "original" } }
> >  !   { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:g\\) map\\(to:h\\)" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ 
> > map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ 
> > map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ 
> > map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ 
> > map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(readonly,to:\\*.+ map\\(alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > -! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(readonly,to:a.+ map\\(alloc:a.+ map\\(readonly,to:b.+ map\\(alloc:b.+ 
> > map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel 
> > map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc kernels 
> > map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc serial 
> > map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc data 
> > map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(readonly,to:\\*.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:\\*.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:\\*.+ map\\(alloc:c.+" 1 "original" } }
> > +! { dg-final { scan-tree-dump-times "(?n)#pragma acc enter data 
> > map\\(readonly,to:a.+ map\\(pt_readonly,alloc:a.+ map\\(readonly,to:b.+ 
> > map\\(pt_readonly,alloc:b.+ map\\(to:c.+ map\\(alloc:c.+" 1 "original" } }
> >  
> >  ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache 
> > \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: 
> > .+\\\]\\) \\(readonly:\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data 
> > \\\[len: .+\\\]\\);" 8 "original" } }
> >  ! { dg-final { scan-tree-dump-times "(?n)#pragma acc cache 
> > \\(\\*\\(integer\\(kind=4\\)\\\[0:\\\] \\*\\) parm.*data \\\[len: 
> > .+\\\]\\);" 8 "original" } }
> 
> Can we also get an OpenACC/Fortran test case à la
> 'c-c++-common/goacc/readonly-2.c' to demonstrate this doing something?
> 
> > --- a/gcc/testsuite/gfortran.dg/pr67170.f90
> > +++ b/gcc/testsuite/gfortran.dg/pr67170.f90
> > @@ -28,4 +28,4 @@ end subroutine foo
> >  end module test_module
> >  end program
> >  
> > -! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\);" 1 "fre1" } 
> > }
> > +! { dg-final { scan-tree-dump-times "= \\*arg_\[0-9\]+\\(D\\)\\(ptro\\);" 
> > 1 "fre1" } }
> 
> Is it understood what's happening here, that this is the correct
> behavior?  I suppose so -- there's no actual change in behavior -- as
> this here doesn't trigger for OpenACC 'readonly' modifier, but just the
> pretty printer change for 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
> 
> > --- a/gcc/tree-pretty-print.cc
> > +++ b/gcc/tree-pretty-print.cc
> > @@ -915,6 +915,8 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
> > spc, dump_flags_t flags)
> >        pp_string (pp, "map(");
> >        if (OMP_CLAUSE_MAP_READONLY (clause))
> >     pp_string (pp, "readonly,");
> > +      if (OMP_CLAUSE_MAP_POINTS_TO_READONLY (clause))
> > +   pp_string (pp, "pt_readonly,");
> >        switch (OMP_CLAUSE_MAP_KIND (clause))
> >     {
> >     case GOMP_MAP_ALLOC:
> > @@ -3620,6 +3622,8 @@ dump_generic_node (pretty_printer *pp, tree node, int 
> > spc, dump_flags_t flags,
> >     pp_string (pp, "(D)");
> >        if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (node))
> >     pp_string (pp, "(ab)");
> > +      if (SSA_NAME_POINTS_TO_READONLY_MEMORY (node))
> > +   pp_string (pp, "(ptro)");
> >        break;
> >  
> >      case WITH_SIZE_EXPR:
> 
> > --- a/gcc/tree-ssanames.cc
> > +++ b/gcc/tree-ssanames.cc
> > @@ -402,6 +402,9 @@ make_ssa_name_fn (struct function *fn, tree var, gimple 
> > *stmt,
> >    else
> >      SSA_NAME_RANGE_INFO (t) = NULL;
> >  
> > +  if (VAR_P (var) && DECL_POINTS_TO_READONLY (var))
> > +    SSA_NAME_POINTS_TO_READONLY_MEMORY (t) = 1;
> > +
> >    SSA_NAME_IN_FREE_LIST (t) = 0;
> >    SSA_NAME_IS_DEFAULT_DEF (t) = 0;
> >    init_ssa_name_imm_use (t);
> 
> > --- a/gcc/tree.h
> > +++ b/gcc/tree.h
> > @@ -1036,6 +1036,13 @@ extern void omp_clause_range_check_failed 
> > (const_tree, const char *, int,
> >  #define DECL_HIDDEN_STRING_LENGTH(NODE) \
> >    (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
> >  
> > +/* In a VAR_DECL, set for variables regarded as pointing to memory not 
> > written
> > +   to. SSA_NAME_POINTS_TO_READONLY_MEMORY gets set for SSA_NAMEs created 
> > from
> > +   such VAR_DECLs. Currently used by OpenACC 'readonly' modifier in copyin
> > +   clauses.  */
> > +#define DECL_POINTS_TO_READONLY(NODE) \
> > +  (TREE_CHECK (NODE, VAR_DECL)->decl_common.decl_not_flexarray)
> 
> Again update the table for the flag uses are listed?
> 
> (There is a 'VAR_DECL_CHECK', which hopefully means the same thing.)
> 
> > +
> >  /* In a CALL_EXPR, means that the call is the jump from a thunk to the
> >     thunked-to function.  Be careful to avoid using this macro when one of 
> > the
> >     next two applies instead.  */
> > @@ -1845,6 +1852,10 @@ class auto_suppress_location_wrappers
> >  #define OMP_CLAUSE_MAP_READONLY(NODE) \
> >    TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> >  
> > +/* Set if 'OMP_CLAUSE_DECL (NODE)' points to read-only memory.  */
> > +#define OMP_CLAUSE_MAP_POINTS_TO_READONLY(NODE) \
> > +  TREE_CONSTANT (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP))
> > +
> >  /* Same as above, for use in OpenACC cache directives.  */
> >  #define OMP_CLAUSE__CACHE__READONLY(NODE) \
> >    TREE_READONLY (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__CACHE_))
> 
> (Note, corresponding 'OMP_CLAUSE_MAP_POINTS_TO_READONLY' doesn't exist
> yet, due to missing actual handling of the OpenACC 'cache' directive;
> 'gcc/gimplify.cc:gimplify_oacc_cache'.)
> 
> 
> Grüße
>  Thomas
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Reply via email to