https://bugs.llvm.org/show_bug.cgi?id=47039

            Bug ID: 47039
           Summary: Target followed by Atomic give incorrect result
           Product: OpenMP
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Runtime Library
          Assignee: unassignedb...@nondot.org
          Reporter: tapplenco...@anl.gov
                CC: llvm-bugs@lists.llvm.org

Created attachment 23826
  --> https://bugs.llvm.org/attachment.cgi?id=23826&action=edit
Reproducer

Found in `clang version 12.0.0 (https://github.com/llvm/llvm-project.git
55ead5bfffdc00e84cff347ee98471b5616a9f48)` running on an NVIDIA V100.

Hi,

When run, the following code:

```
  float counter_target{};
  #pragma omp target map(tofrom: counter_target)
  {
    #pragma omp atomic update
    counter_target = counter_target +  1. ;
  }
```
produces 128, whereas we expected the result to be 1. Indeed, only one thread
should be active in the target region.
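
For readers without access to the attachment, the snippet can be wrapped in a small function so the result is easy to check. This is only a sketch: the function name `run_target_atomic` is hypothetical and not taken from the attached reproducer, and the build line in the comment assumes a clang built with NVPTX offloading support.

```cpp
// Hypothetical wrapper around the snippet above (not the attached reproducer).
// Assumed build line for offloading:
//   clang++ -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda file.cpp
float run_target_atomic() {
  float counter_target{};
  // No teams or parallel construct here, so a single thread is expected
  // to execute the body of the target region.
  #pragma omp target map(tofrom: counter_target)
  {
    #pragma omp atomic update
    counter_target = counter_target + 1.;
  }
  return counter_target;
}
```

With a conforming implementation the function should return 1; with the build described in this report, the equivalent code yields 128.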

Running with LIBOMPTARGET_DEBUG=1 gives a hint about a potential root cause of the problem:
```
Target CUDA RTL --> Setting CUDA threads per block to default 128
Target CUDA RTL --> Using default number of teams 128
Target CUDA RTL --> Launch kernel with 128 blocks and 128 threads
```
It looks like the runtime spawned 128 threads, corresponding to 128 teams,
regardless of the absence of a `teams` pragma.
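
One possible way to probe this hypothesis from inside the program is to ask the runtime how many teams and threads it reports in a bare target region. The sketch below has not been run on the affected setup; the helper name `query_target_league` is hypothetical, and the values reported by the runtime may not match the launch configuration printed in the debug log.

```cpp
#include <utility>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical diagnostic helper: returns {teams, threads} as reported
// inside a target region that has no teams or parallel construct.
std::pair<int, int> query_target_league() {
  int num_teams = 1;
  int num_threads = 1;
  #pragma omp target map(tofrom: num_teams, num_threads)
  {
#ifdef _OPENMP
    // Both queries are expected to return 1 for a bare target region.
    num_teams = omp_get_num_teams();
    num_threads = omp_get_num_threads();
#endif
  }
  return {num_teams, num_threads};
}
```

If either value differs from 1, that would support the idea that the region is executed by more than the single initial thread.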

Regards,
Thomas

PS: I attached a full reproducer for convenience. You can also find it here:
https://github.com/TApplencourt/OvO/blob/master/test_src/cpp/hierarchical_parallelism/atomic-float/target.cpp

