Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2024-04-12 Thread Richard Biener
On Thu, 11 Apr 2024, Thomas Schwinge wrote:

> Hi Chung-Lin, Richard!
> 
> From me just a few mechanical pieces, see below.  Richard, are you able
> to again comment on Chung-Lin's general strategy, as I'm not at all
> familiar with those parts of the code?

I've queued all stage1 material and will be only able to slowly look
at it after we released.

> On 2024-04-03T19:50:55+0800, Chung-Lin Tang  
> wrote:
> > On 2023/10/30 8:46 PM, Richard Biener wrote:
> >>>
> >>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> >>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> >>> flag.
> >>>
> >>> The actual optimization then is done in this second patch.  Chung-Lin
> >>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> >>> I don't have much experience with most of the following generic code, so
> >>> would appreciate a helping hand, whether that conceptually makes sense as
> >>> well as from the implementation point of view:
> >
> > First of all, I have removed all of the gimplify-stage scanning and setting 
> > of
> > DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no 
> > changes to
> > gimplify.cc now)
> >
> > I remember this code was an artifact of earlier attempts to allow 
> > struct-member
> > pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed 
> > anyways.
> > I think the omp_data_* member accesses when building child function side
> > receiver_refs is blocking points-to analysis from working (didn't try 
> > digging deeper)
> >
> > Also during gimplify, VAR_DECLs appeared to be reused (at least in some 
> > cases) for map
> > clause decl reference building, so hoping that the variables "happen to be" 
> > single-use and
> > DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY 
> > does appear to be
> > a little risky.
> >
> > However, for firstprivate pointers processed during omp-low, it appears to 
> > be somewhat different.
> > (see below description)
> >
> >> No, I don't think you can use that flag on non-default-defs, nor
> >> preserve it on copying.  So
> >> it also doesn't nicely extend to DECLs as done by the patch.  We
> >> currently _only_ use it
> >> for incoming parameters.  When used on arbitrary code you can get to for 
> >> example
> >> 
> >> ptr1(points-to-readony-memory) = >x;
> >> ... access via ptr1 ...
> >> ptr2 = >x;
> >> ... access via ptr2 ...
> >> 
> >> where both are your OMP regions differently constrained (the constrain is 
> >> on the
> >> code in the region, _not_ on the actual protections of the pointed to
> >> data, much like
> >> for the fortran case).  But now CSE comes along and happily replaces all 
> >> ptr2
> >> with ptr2 in the second region and ... oops!
> >
> > Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 
> > in the second region"?
> >
> > That doesn't happen, because during omp-lower/expand, OMP target regions 
> > (which is all that
> > this applies currently) is separated into different individual child 
> > functions.
> >
> > (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during 
> > omp-lower, when
> > for firstprivate pointers (i.e. 'a' here) we set this bit when constructing 
> > the first load
> > of this pointer)
> >
> >   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> >   {
> > foo (a, a[8]);
> > r = a[8];
> >   }
> >   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
> >   {
> > foo (a, a[12]);
> > r = a[12];
> >   }
> >
> > After omp-expand (before SSA):
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> > {
> >  ...
> >:
> >   D.2962 = .omp_data_i->D.2947;
> >   a.8 = D.2962;
> >   r.1 = (*a.8)[12];
> >   foo (a.8, r.1);
> >   r.1 = (*a.8)[12];
> >   D.2965 = .omp_data_i->r;
> >   *D.2965 = r.1;
> >   return;
> > }
> >
> > __attribute__((oacc parallel, omp target entrypoint, noclone))
> > void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> > {
> >   ...
> >:
> >   D.2968 = .omp_data_i->D.2939;
> >   a.4 = D.2968;
> >   r.0 = (*a.4)[8];
> >   foo (a.4, r.0);
> >   r.0 = (*a.4)[8];
> >   D.2971 = .omp_data_i->r;
> >   *D.2971 = r.0;
> >   return;
> > }
> >
> > So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to
> > SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a 
> > default-def
> > for an PARM_DECL, at least conceptually.
> >
> > (If offloading was structured significantly differently, say if child 
> > functions
> > were separated much earlier before omp-lowering, than this 
> > readonly-modifier might
> > possibly be a direct application of 'r' in the "fn spec" attribute)
> >
> > Other changes since first version of patch include:
> > 1) update of C/C++ FE changes to new style in c-family/c-omp.cc
> > 2) merging of two if cases in fortran/trans-openmp.cc like 

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2024-04-11 Thread Thomas Schwinge
Hi Chung-Lin, Richard!

>From me just a few mechanical pieces, see below.  Richard, are you able
to again comment on Chung-Lin's general strategy, as I'm not at all
familiar with those parts of the code?

On 2024-04-03T19:50:55+0800, Chung-Lin Tang  wrote:
> On 2023/10/30 8:46 PM, Richard Biener wrote:
>>>
>>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
>>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
>>> flag.
>>>
>>> The actual optimization then is done in this second patch.  Chung-Lin
>>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
>>> I don't have much experience with most of the following generic code, so
>>> would appreciate a helping hand, whether that conceptually makes sense as
>>> well as from the implementation point of view:
>
> First of all, I have removed all of the gimplify-stage scanning and setting of
> DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes 
> to
> gimplify.cc now)
>
> I remember this code was an artifact of earlier attempts to allow 
> struct-member
> pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed 
> anyways.
> I think the omp_data_* member accesses when building child function side
> receiver_refs is blocking points-to analysis from working (didn't try digging 
> deeper)
>
> Also during gimplify, VAR_DECLs appeared to be reused (at least in some 
> cases) for map
> clause decl reference building, so hoping that the variables "happen to be" 
> single-use and
> DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does 
> appear to be
> a little risky.
>
> However, for firstprivate pointers processed during omp-low, it appears to be 
> somewhat different.
> (see below description)
>
>> No, I don't think you can use that flag on non-default-defs, nor
>> preserve it on copying.  So
>> it also doesn't nicely extend to DECLs as done by the patch.  We
>> currently _only_ use it
>> for incoming parameters.  When used on arbitrary code you can get to for 
>> example
>> 
>> ptr1(points-to-readony-memory) = >x;
>> ... access via ptr1 ...
>> ptr2 = >x;
>> ... access via ptr2 ...
>> 
>> where both are your OMP regions differently constrained (the constrain is on 
>> the
>> code in the region, _not_ on the actual protections of the pointed to
>> data, much like
>> for the fortran case).  But now CSE comes along and happily replaces all ptr2
>> with ptr2 in the second region and ... oops!
>
> Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in 
> the second region"?
>
> That doesn't happen, because during omp-lower/expand, OMP target regions 
> (which is all that
> this applies currently) is separated into different individual child 
> functions.
>
> (Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during 
> omp-lower, when
> for firstprivate pointers (i.e. 'a' here) we set this bit when constructing 
> the first load
> of this pointer)
>
>   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
>   {
> foo (a, a[8]);
> r = a[8];
>   }
>   #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
>   {
> foo (a, a[12]);
> r = a[12];
>   }
>
> After omp-expand (before SSA):
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
> {
>  ...
>:
>   D.2962 = .omp_data_i->D.2947;
>   a.8 = D.2962;
>   r.1 = (*a.8)[12];
>   foo (a.8, r.1);
>   r.1 = (*a.8)[12];
>   D.2965 = .omp_data_i->r;
>   *D.2965 = r.1;
>   return;
> }
>
> __attribute__((oacc parallel, omp target entrypoint, noclone))
> void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
> {
>   ...
>:
>   D.2968 = .omp_data_i->D.2939;
>   a.4 = D.2968;
>   r.0 = (*a.4)[8];
>   foo (a.4, r.0);
>   r.0 = (*a.4)[8];
>   D.2971 = .omp_data_i->r;
>   *D.2971 = r.0;
>   return;
> }
>
> So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to
> SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a 
> default-def
> for an PARM_DECL, at least conceptually.
>
> (If offloading was structured significantly differently, say if child 
> functions
> were separated much earlier before omp-lowering, than this readonly-modifier 
> might
> possibly be a direct application of 'r' in the "fn spec" attribute)
>
> Other changes since first version of patch include:
> 1) update of C/C++ FE changes to new style in c-family/c-omp.cc
> 2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
> 3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to 
> verify removal of a MEM load, also as Thomas suggested.

Thanks!

> I have re-tested this patch using mainline, with no regressions. Is this okay 
> for mainline?

> 2024-04-03  Chung-Lin Tang  
>
> gcc/c-family/ChangeLog:
>
>   * c-omp.cc (c_omp_address_inspector::expand_array_base):
>   Set 

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2024-04-03 Thread Chung-Lin Tang
Hi Richard, Thomas,

On 2023/10/30 8:46 PM, Richard Biener wrote:
>>
>> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
>> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
>> flag.
>>
>> The actual optimization then is done in this second patch.  Chung-Lin
>> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
>> I don't have much experience with most of the following generic code, so
>> would appreciate a helping hand, whether that conceptually makes sense as
>> well as from the implementation point of view:

First of all, I have removed all of the gimplify-stage scanning and setting of
DECL_POINTS_TO_READONLY and SSA_NAME_POINTS_TO_READONLY_MEMORY (so no changes to
gimplify.cc now)

I remember this code was an artifact of earlier attempts to allow struct-member
pointer mappings to also work (e.g. map(readonly:rec.ptr[:N])), but failed 
anyways.
I think the omp_data_* member accesses when building child function side
receiver_refs is blocking points-to analysis from working (didn't try digging 
deeper)

Also during gimplify, VAR_DECLs appeared to be reused (at least in some cases) 
for map
clause decl reference building, so hoping that the variables "happen to be" 
single-use and
DECL_POINTS_TO_READONLY relaying into SSA_NAME_POINTS_TO_READONLY_MEMORY does 
appear to be
a little risky.

However, for firstprivate pointers processed during omp-low, it appears to be 
somewhat different.
(see below description)

> No, I don't think you can use that flag on non-default-defs, nor
> preserve it on copying.  So
> it also doesn't nicely extend to DECLs as done by the patch.  We
> currently _only_ use it
> for incoming parameters.  When used on arbitrary code you can get to for 
> example
> 
> ptr1(points-to-readony-memory) = >x;
> ... access via ptr1 ...
> ptr2 = >x;
> ... access via ptr2 ...
> 
> where both are your OMP regions differently constrained (the constrain is on 
> the
> code in the region, _not_ on the actual protections of the pointed to
> data, much like
> for the fortran case).  But now CSE comes along and happily replaces all ptr2
> with ptr2 in the second region and ... oops!

Richard, I assume what you meant was "happily replaces all ptr2 with ptr1 in 
the second region"?

That doesn't happen, because during omp-lower/expand, OMP target regions (which 
is all that
this applies currently) is separated into different individual child functions.

(Currently, the only "effective" use of DECL_POINTS_TO_READONLY is during 
omp-lower, when
for firstprivate pointers (i.e. 'a' here) we set this bit when constructing the 
first load
of this pointer)

  #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
  {
foo (a, a[8]);
r = a[8];
  }
  #pragma acc parallel copyin(readonly: a[:32]) copyout(r)
  {
foo (a, a[12]);
r = a[12];
  }

After omp-expand (before SSA):

__attribute__((oacc parallel, omp target entrypoint, noclone))
void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i)
{
 ...
   :
  D.2962 = .omp_data_i->D.2947;
  a.8 = D.2962;
  r.1 = (*a.8)[12];
  foo (a.8, r.1);
  r.1 = (*a.8)[12];
  D.2965 = .omp_data_i->r;
  *D.2965 = r.1;
  return;
}

__attribute__((oacc parallel, omp target entrypoint, noclone))
void main._omp_fn.0 (const struct .omp_data_t.2 & restrict .omp_data_i)
{
  ...
   :
  D.2968 = .omp_data_i->D.2939;
  a.4 = D.2968;
  r.0 = (*a.4)[8];
  foo (a.4, r.0);
  r.0 = (*a.4)[8];
  D.2971 = .omp_data_i->r;
  *D.2971 = r.0;
  return;
}

So actually, the creating of DECL_POINTS_TO_READONLY and its relaying to
SSA_NAME_POINTS_TO_READONLY_MEMORY here, is actually quite similar to a 
default-def
for an PARM_DECL, at least conceptually.

(If offloading was structured significantly differently, say if child functions
were separated much earlier before omp-lowering, than this readonly-modifier 
might
possibly be a direct application of 'r' in the "fn spec" attribute)

Other changes since first version of patch include:
1) update of C/C++ FE changes to new style in c-family/c-omp.cc
2) merging of two if cases in fortran/trans-openmp.cc like Thomas suggested
3) Update of readonly-2.c testcase to scan before/after "fre1" pass, to verify 
removal of a MEM load, also as Thomas suggested.

I have re-tested this patch using mainline, with no regressions. Is this okay 
for mainline?

Thanks,
Chung-Lin

2024-04-03  Chung-Lin Tang  

gcc/c-family/ChangeLog:

* c-omp.cc (c_omp_address_inspector::expand_array_base):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
(c_omp_address_inspector::expand_component_selector): Likewise.

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_array_section):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.

gcc/ChangeLog:

* gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
for VAR_DECLs.
* omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
variables of 

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2023-10-30 Thread Richard Biener
On Fri, Oct 27, 2023 at 4:28 PM Thomas Schwinge  wrote:
>
> Hi!
>
> Richard, as the original author of 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
> 2018 commit 6214d5c7e7470bdd5ecbeae668c2522551bfebbc (Subversion r263958)
> "Move const_parm trick to generic code"; 'gcc/tree.h':
>
> /* Nonzero if this SSA_NAME is known to point to memory that may not
>be written to.  This is set for default defs of function parameters
>that have a corresponding r or R specification in the functions
>fn spec attribute.  This is used by alias analysis.  */
> #define SSA_NAME_POINTS_TO_READONLY_MEMORY(NODE) \
> SSA_NAME_CHECK (NODE)->base.deprecated_flag
>
> ..., may I ask you to please help review the following patch
> (full-quoted)?
>
> For context: this patch here ("second patch") depends on a first patch:
> 
> "[PATCH, OpenACC 2.7] readonly modifier support in front-ends".  That one
> is still under review/rework; so you're not able to apply this second
> patch here.
>
> In a nutshell: a 'readonly' modifier has been added to the OpenACC
> 'copyin' clause (copy host to device memory, don't copy back at end of
> region):
>
> | If the optional 'readonly' modifier appears, then the implementation may 
> assume that the data
> | referenced by _var-list_ is never written to within the applicable region.
>
> That is, for example (untested):
>
> #pragma acc routine
> void escape(int *);
>
> int x[32] = [...];
> #pragma acc parallel copyin(readonly: x)
> {
>   int a1 = x[3];
>   escape(x);
>   int a2 = x[3]; // Per 'readonly', don't need to reload 'x[3]' here.
>   //x[22] = 0; // Invalid -- but no diagnostic mandated.
> }
>
> What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
> 'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
> flag.
>
> The actual optimization then is done in this second patch.  Chung-Lin
> found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
> I don't have much experience with most of the following generic code, so
> would appreciate a helping hand, whether that conceptually makes sense as
> well as from the implementation point of view:

No, I don't think you can use that flag on non-default-defs, nor
preserve it on copying.  So
it also doesn't nicely extend to DECLs as done by the patch.  We
currently _only_ use it
for incoming parameters.  When used on arbitrary code you can get to for example

ptr1(points-to-readony-memory) = >x;
... access via ptr1 ...
ptr2 = >x;
... access via ptr2 ...

where both are your OMP regions differently constrained (the constrain is on the
code in the region, _not_ on the actual protections of the pointed to
data, much like
for the fortran case).  But now CSE comes along and happily replaces all ptr2
with ptr2 in the second region and ... oops!

So no, re-using SSA_NAME_POINTS_TO_READONLY_MEMORY doesn't look good.

Richard.

> On 2023-07-25T23:52:06+0800, Chung-Lin Tang via Gcc-patches 
>  wrote:
> > On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
> >> As we discussed earlier, the work for actually linking this to middle-end
> >> points-to analysis is a somewhat non-trivial issue. This first patch allows
> >> the language feature to be used in OpenACC directives first (with no 
> >> effect for now).
> >> The middle-end changes are probably going to be a later patch.
> >
> > This second patch tries to link the readonly modifier to points-to analysis.
> >
> > There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and it's support in 
> > the
> > alias oracle routines in tree-ssa-alias.cc, so basically what this patch 
> > does is
> > try to make the variables holding the array section base pointers to have 
> > this
> > flag set.
> >
> > There is an another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends on 
> > the
> > associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
> > Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the 
> > tmp
> > vars carrying these receiver references on the offloaded side. These
> > eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.
>
>
> > This still doesn't always work as expected in terms of optimization:
> > struct pointer fields and Fortran arrays (kind of like C structs) which have
> > several accesses to create the pointer access on the receive/offloaded side,
> > and SRA appears to not work on these sequences, so gets in the way of much
> > redundancy elimination.
>
> I understand correctly that this is left as future work?  Please add the test
> cases you have, XFAILed in some reasonable way.
>
>
> > Currently have one testcase where we can demonstrate 'readonly' can avoid
> > a clobber by function call.
>
> :-)
>
>
> > --- a/gcc/c/c-typeck.cc
> > +++ b/gcc/c/c-typeck.cc
> > @@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum 
> > c_omp_region_type ort)
> >   OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
> >else
> >   

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2023-10-27 Thread Thomas Schwinge
Hi!

Richard, as the original author of 'SSA_NAME_POINTS_TO_READONLY_MEMORY':
2018 commit 6214d5c7e7470bdd5ecbeae668c2522551bfebbc (Subversion r263958)
"Move const_parm trick to generic code"; 'gcc/tree.h':

/* Nonzero if this SSA_NAME is known to point to memory that may not
   be written to.  This is set for default defs of function parameters
   that have a corresponding r or R specification in the functions
   fn spec attribute.  This is used by alias analysis.  */
#define SSA_NAME_POINTS_TO_READONLY_MEMORY(NODE) \
SSA_NAME_CHECK (NODE)->base.deprecated_flag

..., may I ask you to please help review the following patch
(full-quoted)?

For context: this patch here ("second patch") depends on a first patch:

"[PATCH, OpenACC 2.7] readonly modifier support in front-ends".  That one
is still under review/rework; so you're not able to apply this second
patch here.

In a nutshell: a 'readonly' modifier has been added to the OpenACC
'copyin' clause (copy host to device memory, don't copy back at end of
region):

| If the optional 'readonly' modifier appears, then the implementation may 
assume that the data
| referenced by _var-list_ is never written to within the applicable region.

That is, for example (untested):

#pragma acc routine
void escape(int *);

int x[32] = [...];
#pragma acc parallel copyin(readonly: x)
{
  int a1 = x[3];
  escape(x);
  int a2 = x[3]; // Per 'readonly', don't need to reload 'x[3]' here.
  //x[22] = 0; // Invalid -- but no diagnostic mandated.
}

What Chung-Lin's first patch does is mark the OMP clause for 'x' (not the
'x' decl itself!) as 'readonly', via a new 'OMP_CLAUSE_MAP_READONLY'
flag.

The actual optimization then is done in this second patch.  Chung-Lin
found that he could use 'SSA_NAME_POINTS_TO_READONLY_MEMORY' for that.
I don't have much experience with most of the following generic code, so
would appreciate a helping hand, whether that conceptually makes sense as
well as from the implementation point of view:

On 2023-07-25T23:52:06+0800, Chung-Lin Tang via Gcc-patches 
 wrote:
> On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
>> As we discussed earlier, the work for actually linking this to middle-end
>> points-to analysis is a somewhat non-trivial issue. This first patch allows
>> the language feature to be used in OpenACC directives first (with no effect 
>> for now).
>> The middle-end changes are probably going to be a later patch.
>
> This second patch tries to link the readonly modifier to points-to analysis.
>
> There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and it's support in 
> the
> alias oracle routines in tree-ssa-alias.cc, so basically what this patch does 
> is
> try to make the variables holding the array section base pointers to have this
> flag set.
>
> There is an another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends on the
> associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
> Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the tmp
> vars carrying these receiver references on the offloaded side. These
> eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.


> This still doesn't always work as expected in terms of optimization:
> struct pointer fields and Fortran arrays (kind of like C structs) which have
> several accesses to create the pointer access on the receive/offloaded side,
> and SRA appears to not work on these sequences, so gets in the way of much
> redundancy elimination.

I understand correctly that this is left as future work?  Please add the test
cases you have, XFAILed in some reasonable way.


> Currently have one testcase where we can demonstrate 'readonly' can avoid
> a clobber by function call.

:-)


> --- a/gcc/c/c-typeck.cc
> +++ b/gcc/c/c-typeck.cc
> @@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum 
> c_omp_region_type ort)
>   OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
>else
>   OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> +  if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
>OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
>if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> && !c_mark_addressable (t))

> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum 
> c_omp_region_type ort)
>   }
> else
>   OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
> +   if (OMP_CLAUSE_MAP_READONLY (c))
> + OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
> OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
> if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
> && !cxx_mark_addressable (t))

> --- a/gcc/fortran/trans-openmp.cc
> +++ b/gcc/fortran/trans-openmp.cc
> @@ -2524,6 +2524,8 @@ 

[PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2023-07-25 Thread Chung-Lin Tang via Gcc-patches
On 2023/7/11 2:33 AM, Chung-Lin Tang via Gcc-patches wrote:
> As we discussed earlier, the work for actually linking this to middle-end
> points-to analysis is a somewhat non-trivial issue. This first patch allows
> the language feature to be used in OpenACC directives first (with no effect 
> for now).
> The middle-end changes are probably going to be a later patch.

This second patch tries to link the readonly modifier to points-to analysis.

There already exists SSA_NAME_POINTS_TO_READONLY_MEMORY and it's support in the
alias oracle routines in tree-ssa-alias.cc, so basically what this patch does is
try to make the variables holding the array section base pointers to have this
flag set.

There is an another OMP_CLAUSE_MAP_POINTS_TO_READONLY set by front-ends on the
associated pointer clauses if OMP_CLAUSE_MAP_READONLY is set.
Also a DECL_POINTS_TO_READONLY flag is set for VAR_DECLs when creating the tmp
vars carrying these receiver references on the offloaded side. These
eventually get translated to SSA_NAME_POINTS_TO_READONLY_MEMORY.

This still doesn't always work as expected in terms of optimization:
struct pointer fields and Fortran arrays (kind of like C structs) which have
several accesses to create the pointer access on the receive/offloaded side,
and SRA appears to not work on these sequences, so gets in the way of much
redundancy elimination.

Currently have one testcase where we can demonstrate 'readonly' can avoid
a clobber by function call. Tested on powerpc64le-linux/nvptx.

Note this patch is create a-top of the front-end patch.
(will respond to the other front-end patch comments later)

Thanks,
Chung-Lin

2023-07-25  Chung-Lin Tang  

gcc/c/ChangeLog:

* c-typeck.cc (handle_omp_array_sections):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.

gcc/cp/ChangeLog:

* semantics.cc (handle_omp_array_sections):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.

gcc/fortran/ChangeLog:

* trans-openmp.cc (gfc_trans_omp_array_section):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.

gcc/ChangeLog:

* gimple-expr.cc (copy_var_decl): Copy DECL_POINTS_TO_READONLY
for VAR_DECLs.
* gimplify.cc (struct gimplify_omp_ctx):
Add 'hash_set *pt_readonly_ptrs' field.
(internal_get_tmp_var): Set
DECL_POINTS_TO_READONLY/SSA_NAME_POINTS_TO_READONLY_MEMORY for
new temp vars.
(build_omp_struct_comp_nodes):
Set OMP_CLAUSE_MAP_POINTS_TO_READONLY on pointer clause.
(gimplify_scan_omp_clauses): Collect OMP_CLAUSE_MAP_POINTS_TO_READONLY
to ctx->pt_readonly_ptrs.
* omp-low.cc (lower_omp_target): Set DECL_POINTS_TO_READONLY for
variables of receiver refs.
* tree-pretty-print.cc (dump_omp_clause):
Print OMP_CLAUSE_MAP_POINTS_TO_READONLY.
(dump_generic_node): Print SSA_NAME_POINTS_TO_READONLY_MEMORY.
* tree.h (DECL_POINTS_TO_READONLY): New macro.
(OMP_CLAUSE_MAP_POINTS_TO_READONLY): New macro.

gcc/testsuite/ChangeLog:

* c-c++-common/goacc/readonly-1.c: Adjust testcase.
* c-c++-common/goacc/readonly-2.c: New testcase.
* gfortran.dg/goacc/readonly-1.f90: Adjust testcase.
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..42591e4029a 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -14258,6 +14258,8 @@ handle_omp_array_sections (tree c, enum 
c_omp_region_type ort)
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_ATTACH_DETACH);
   else
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
+  if (OMP_CLAUSE_MAP_READONLY (c))
+   OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
   OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
   if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
  && !c_mark_addressable (t))
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 8fb47fd179e..6ab467e1140 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -5872,6 +5872,8 @@ handle_omp_array_sections (tree c, enum c_omp_region_type 
ort)
}
  else
OMP_CLAUSE_SET_MAP_KIND (c2, GOMP_MAP_FIRSTPRIVATE_POINTER);
+ if (OMP_CLAUSE_MAP_READONLY (c))
+   OMP_CLAUSE_MAP_POINTS_TO_READONLY (c2) = 1;
  OMP_CLAUSE_MAP_IMPLICIT (c2) = OMP_CLAUSE_MAP_IMPLICIT (c);
  if (OMP_CLAUSE_MAP_KIND (c2) != GOMP_MAP_FIRSTPRIVATE_POINTER
  && !cxx_mark_addressable (t))
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 2253d559f9c..d7cd65af1bb 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2524,6 +2524,8 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
gfc_exec_op op,
   node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
   OMP_CLAUSE_SET_MAP_KIND (node3, ptr_kind);
   OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl);
+  if (n->u.readonly)
+