https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123750
--- Comment #2 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
Hi, I can now reduce the sparsetests problem to a simpler test case.
The problem with this long statement:
#pragma omp target teams distribute parallel for collapse(2) map(tofrom:count) \
shared(pooled_offsets_starts,pooled_offsets_flat) \
is_device_ptr(pd,pooled_offsets_flat,pooled_offsets_starts) device(devnum)
occurred because the pragma was written inside a member function of a class,
and pooled_offsets_starts and pooled_offsets_flat were member fields of that
class.
A simpler test case is:
#include <omp.h>
using namespace std;
class DataBlock1
{
public:
    size_t* myshareddata1;
    size_t* myshareddata2;
    void func(int dev)
    {
        myshareddata1=(size_t*) omp_target_alloc(sizeof(size_t)*100,dev);
        myshareddata2=(size_t*) omp_target_alloc(sizeof(size_t)*100,dev);
        #pragma omp target teams distribute parallel for shared(myshareddata1) \
            is_device_ptr(myshareddata2) device(dev)
        for (size_t i=0;i<100;i++)
        {
            myshareddata1[i]=1;
            myshareddata2[i]=2;
        }
    }
};
int main(int argc, char** argv)
{
    int dev=omp_get_default_device();
    DataBlock1 A1;
    A1.func(dev);
    return 0;
}
which leads to:
/home/benni/projects/openmptestnew/openmpoffloatest/main.cpp:18:12: internal compiler error: in gimplify_var_or_parm_decl, at gimplify.cc:3426
18 | #pragma omp target teams distribute parallel for shared(myshareddata1) is_device_ptr(myshareddata2)device(dev)
| ^~~
/usr/libexec/
The problem in this case goes away if one removes the shared clause.
OpenMP implicitly shares variables declared outside the scope of the OpenMP
pragma by default. So omitting the shared clause here should have the same
effect as using it, but instead one gets a GCC internal compiler error...
One can also evade the problem by turning the shared variable into a purely
local variable:
size_t *myshareddata3=myshareddata1;
#pragma omp target teams distribute parallel for shared(myshareddata3) \
is_device_ptr(myshareddata2) device(dev)
But if I have to turn every member field of a C++ class into a local variable
before using it in an OpenMP clause, then I am basically writing pure C.
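For reference, the local-variable workaround can be written out as a complete
member function. This is only a sketch under stated assumptions, not a
confirmed fix: the class name DataBlockLocalAlias, the aliases p1 and p2, and
the copy-back check are hypothetical additions, and the host-only stand-ins in
the #else branch exist only so that the file also builds without -fopenmp.
Unlike the snippet above, the sketch lists both local aliases in is_device_ptr
instead of using shared. That choice is an assumption on my side and has not
been checked against the GCC version that shows the ICE.

```cpp
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: host-only stand-ins so the sketch also builds without -fopenmp.
#include <cstdlib>
#include <cstring>
inline int omp_get_default_device() { return 0; }
inline int omp_get_initial_device() { return 0; }
inline void* omp_target_alloc(std::size_t size, int) { return std::malloc(size); }
inline void omp_target_free(void* ptr, int) { std::free(ptr); }
inline int omp_target_memcpy(void* dst, const void* src, std::size_t length,
                             std::size_t dst_offset, std::size_t src_offset,
                             int, int)
{
    std::memcpy(static_cast<char*>(dst) + dst_offset,
                static_cast<const char*>(src) + src_offset, length);
    return 0;
}
#endif

// Hypothetical variant of DataBlock1: no member field is named in a clause.
class DataBlockLocalAlias
{
public:
    std::size_t* myshareddata1 = nullptr;
    std::size_t* myshareddata2 = nullptr;

    // Fills both device buffers, copies them back to the host, and reports
    // whether the expected values (1 and 2) arrived.
    bool func(int dev)
    {
        const std::size_t n = 100;
        const std::size_t bytes = sizeof(std::size_t) * n;
        myshareddata1 = static_cast<std::size_t*>(omp_target_alloc(bytes, dev));
        myshareddata2 = static_cast<std::size_t*>(omp_target_alloc(bytes, dev));
        bool ok = (myshareddata1 != nullptr && myshareddata2 != nullptr);
        if (ok)
        {
            // Local aliases of the member pointers, used inside the region.
            std::size_t* p1 = myshareddata1;
            std::size_t* p2 = myshareddata2;

            #pragma omp target teams distribute parallel for \
                is_device_ptr(p1, p2) device(dev)
            for (std::size_t i = 0; i < n; i++)
            {
                p1[i] = 1;
                p2[i] = 2;
            }

            // Copy the device buffers back to host memory for checking.
            std::vector<std::size_t> host1(n), host2(n);
            const int host = omp_get_initial_device();
            ok = omp_target_memcpy(host1.data(), p1, bytes, 0, 0, host, dev) == 0
              && omp_target_memcpy(host2.data(), p2, bytes, 0, 0, host, dev) == 0;
            for (std::size_t i = 0; ok && i < n; i++)
                ok = (host1[i] == 1 && host2[i] == 2);
        }
        omp_target_free(myshareddata1, dev);
        omp_target_free(myshareddata2, dev);
        myshareddata1 = nullptr;
        myshareddata2 = nullptr;
        return ok;
    }
};
```

It would be called like the original test case, for example
DataBlockLocalAlias A1; A1.func(omp_get_default_device());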
In the test example, is_device_ptr apparently causes no problems, but it leads
to the cu_synchronize error when used in the code of the first post, again just
because it is applied to a member variable.
Since a pointer is just a variable containing an address as its value, and
since OpenMP maps the values implicitly, the code from the first post would
work by simply removing the is_device_ptr clause. OpenMP will then use the
array correctly, which raises the question of what is_device_ptr is actually
for when OpenMP understands pointers allocated on the device by
omp_target_alloc without that clause.
As for the problems of mathdemonstrations in the first post, I am not sure yet
why they occur. They seemed to have something to do with device mappings that
were done twice for several variables in the QR decomposition and the
algorithms using MPI. Perhaps there was an error on my side, but I am not sure.
After changing my code to avoid double mappings, it works now, which is
preferable anyway.