https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121664

Benjamin Schulz <schulz.benjamin at googlemail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schulz.benjamin@googlemail.
                   |                            |com

--- Comment #3 from Benjamin Schulz <schulz.benjamin at googlemail dot com> ---
Hi, something I have also noticed: in a C++ matrix multiplication, if you write the inner loop with omp simd and the two outer loops with target teams distribute parallel for collapse(2), the code returns wrong values.

Writing target teams distribute over the first for loop and then parallel for
over the second yields correct results.

I reported this at:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122280

This seems to point to a problem with collapse combined with teams distribute. Using
parallel for in the first loop yields correct results.

Note that clang compiles the code correctly in all cases (and I can confirm it
runs on the device). I would much rather get an ICE than wrong numbers out.

I also see illegal memory accesses in this code:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122281

which clang also compiles correctly. Note that this code involves a class
containing a struct whose member fields are offloaded; the class also offloads
some of its own member variables, and the nested teams distribute / parallel
for loops contain two inner loops that are sequential and use a local goto to
exit early.

This is probably similar to what you reported above.

I don't know for sure; since my code is fairly large, it is difficult to
reduce it to where exactly the problem originates. What I currently know is
that clang compiles my code correctly.
