https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120753

--- Comment #14 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
It turns out that the OpenMP standard contains wording, relevant for C++, which
shows that the exclusion of struct members from clauses was indeed deliberate.

https://www.openmp.org/spec-html/5.2/openmpsu19.html

"Unless otherwise specified, a variable that is part of another variable (as an
array element or a structure element) cannot be a variable list item, an
extended list item or locator list item except if the list appears on a clause
that is associated with a construct within a class non-static member function
and the variable is an accessible data member of the object for which the
non-static member function is invoked."


This means that I can use class members as list items in OpenMP constructs only
if the OpenMP loop is placed inside a non-static member function of the
class...
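To illustrate the quoted rule, here is a minimal sketch. The Tensor type and
the functions fill_member and fill_free are hypothetical names, not taken from
the bug report, and it is an assumption that the compiler in use implements
data members in clauses inside member functions. Inside the member function the
data member is named directly in the clause; outside the class the pointer is
first copied into a local variable, because t.data is part of another variable.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tensor type, loosely modelled on the struct in this report.
struct Tensor {
    std::vector<double> storage;  // owns the memory
    double *data;                 // raw pointer member, as in the report
    int n;

    explicit Tensor(int count)
        : storage(count, 0.0), data(storage.data()), n(count) {}

    // Inside a non-static member function, the data member itself may be
    // named in a clause (the exception stated in the quoted rule).
    void fill_member(double value)
    {
        #pragma omp parallel for shared(data)
        for (int i = 0; i < n; i++)
            data[i] = value;
    }
};

// Outside the class, t.data is part of another variable and may not be a
// list item, so the pointer is copied into a plain local variable first.
void fill_free(Tensor &t, double value)
{
    assert(t.data != nullptr);
    double *p = t.data;
    const int n = t.n;
    #pragma omp parallel for shared(p)
    for (int i = 0; i < n; i++)
        p[i] = value;
}
```

Without OpenMP enabled the pragmas are ignored and both functions run
serially, so the sketch only demonstrates which spellings of the clause the
rule permits.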


Interestingly, the OpenACC standard has no such restriction. For deviceptr, it
just says (line 1748 of the specification):

https://www.openacc.org/sites/default/files/inline-images/Specification/OpenACC-3.3-final.pdf

"In C and C++, the vars in var-list must be pointer variables."

Now if I compile the following with gcc, 

#include <openacc.h>

struct mytensor
{
    int *strides;
    int *extents;
    double *data;
};

int main()
{
    mytensor t;
    // Allocate 20 doubles in device memory.
    t.data = (double*)acc_malloc(sizeof(double) * 20);

    // gcc rejects the struct member t.data in the deviceptr clause.
    #pragma acc parallel loop gang deviceptr(t.data)
    for (int i = 1; i < 20; i++)
    {
        t.data[i] = 20;
    }
}

gcc reports:

t is not a pointer variable

However, I passed t.data to deviceptr, not t, and surely

double *x = t.data;

needs no conversion, so t.data is a pointer variable.
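The argument that t.data has pointer type can be checked at compile time. The
following is only a sketch using the standard <type_traits> header. It shows
that the member's type is a pointer type in the C++ sense; whether that matches
what the OpenACC text means by "pointer variable" is exactly the open question.

```cpp
#include <type_traits>

struct mytensor
{
    int *strides;
    int *extents;
    double *data;
};

// The member has pointer type: it is exactly double*, so no conversion is
// involved when it is assigned to another double*.
static_assert(std::is_pointer<decltype(mytensor::data)>::value,
              "mytensor::data has pointer type");
static_assert(std::is_same<decltype(mytensor::data), double *>::value,
              "mytensor::data is exactly double*");

// The enclosing struct, in contrast, is not a pointer.
static_assert(!std::is_pointer<mytensor>::value,
              "mytensor itself is not a pointer");
```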

As a reference compiler for OpenACC, I would take Nvidia's own nvc++ compiler
from Nvidia's HPC SDK, which you can download.

(Notably, Nvidia's compiler is very conservative about pointer aliasing, but
that can be relaxed with the -Msafeptr=all option. It then generates fast GPU
code in most situations.)


In any case, when invoked with

nvc++ ./main.cpp -acc -gpu=cuda12.9

nvc++ compiles the snippet above, which gcc rejects. As far as I could see
while debugging, nvc++ generates valid CUDA code from this small snippet.

Since Nvidia was one of the founders of the OpenACC standard, I would think
that their interpretation of the standard is correct. So t.data is a pointer
variable, and gcc should be fixed to allow this, at least in OpenACC.



(Actually, forbidding struct members in OpenMP constructs may create
performance problems in any loop over structs, not just on the GPU, since it
prevents member variables from being marked as shared, which may induce
unnecessary copies for each thread.)
