https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100321
Bug ID: 100321
Summary: [OpenMP][nvptx] (Con't) Reduction fails with
optimization and 'loop'/'for simd' but not with 'for'
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Keywords: openmp, wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org
CC: vries at gcc dot gnu.org
Target Milestone: ---
Target: nvptx-none
Created attachment 50703
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50703&action=edit
target_parallel_for_simd.cpp - compile with g++ -fopenmp -O1 (and nvptx
offloading)
Similar to PR target/100232
I had hoped that the posted patch would solve this issue as well, but it does
not :-/
[ https://gcc.gnu.org/pipermail/gcc-patches/2021-April/569038.html ]
(However, it does solve the two sollve_vv issues I mentioned in PR100232 :-)
Thanks!)
Namely, https://github.com/TApplencourt/OvO 's
test_src/cpp/hierarchical_parallelism/reduction_add-complex_double/target_parallel_for_simd.cpp
(also attached) works on the host and AMD GCN, but with nvptx:
g++ -fopenmp -O1 target_parallel_for_simd.cpp -foffload=-latomic
it fails as
Expected: (32768,0) Got: (1024,0)
(with exit status code 112)
The -O1 is needed because of the missing .alias support on nvptx.
When removing the 'simd' from
#pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
it does work.
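
For readers without the attachment at hand, here is a minimal sketch of the
failing pattern. It is not the attached OvO test verbatim: the helper name
run_reduction, the loop bound passed in, and the per-iteration increment of
(1,0) are assumptions inferred from the expected value (32768,0); only the
pragma line is taken from the report. The attached file remains the
authoritative reproducer.

```cpp
#include <cassert>
#include <complex>

// User-defined reduction so that reduction(+: ...) accepts std::complex<double>.
// (Assumed here; a class type needs one for an OpenMP '+' reduction.)
#pragma omp declare reduction(+ : std::complex<double> : omp_out += omp_in)

// Hypothetical helper, not part of the attachment: adds (1,0) n times
// inside the combined construct quoted in the report.
std::complex<double> run_reduction(int n)
{
  assert(n > 0);  // sanity check on the host side only
  std::complex<double> counter_N0{};
  #pragma omp target parallel for simd map(tofrom: counter_N0) reduction(+: counter_N0)
  for (int i0 = 0; i0 < n; i0++)
    counter_N0 = counter_N0 + std::complex<double>{1.0};
  return counter_N0;
}
```

With a correct reduction, run_reduction(32768) should return (32768,0); the
failure described above is that nvptx offloading at -O1 yields (1024,0)
instead. Dropping 'simd' from the pragma in this sketch corresponds to the
variant that works.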