| Field | Value |
| --- | --- |
| Issue | 153435 |
| Summary | Race condition in Flang OpenMP target teams nested loop structure with reduction in CPU runs. |
| Labels | flang |
| Assignees | |
| Reporter | scamp-nvidia |

The following target teams nested-loop reproducer gives results that change from run to run with Flang, which points to a race condition:
```
program omp_snippet_reproducer
implicit none
integer, parameter :: n = 256
real :: a(3,0:n), xx1,a1,a2,checksum
integer :: i, j
a = 0.0
!$omp target
!$omp teams private(xx1,a1,a2)
!$omp loop
do i=0,n-1
a1=0.0d0
a2=0.0d0
!$omp loop private(xx1) reduction(+:a1)
do j=0,n-1
xx1=j*5/100
a1 = a1 + xx1
enddo
a(1,i) = a1
enddo
!$omp end teams
!$omp end target
! Light-weight check: print a checksum scaled to easily show changes
checksum = sum(a)-500000
! Subtract off serial answer
checksum = checksum + 112928
write(*,*) 'Serial Difference:', checksum
end program omp_snippet_reproducer
```
This code is derived from an internal SPEC OMP port. The serial answer is wired into the checksum print: a result of 0 means the answer is correct, and anything else means it is wrong. Compiled with a recent build of Flang, both with no optimization flag and with "-O1", the program gives a different answer on each run:
```
scamp:$ flang test.F90 -o test -fopenmp -v
flang version 22.0.0git (https://github.com/llvm/llvm-project 8c7e1ab98e80c4f224ba2ef7d343534afa237247)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/13
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14
Candidate multilib: .;@m64
Selected multilib: .;@m64
"flang" -fc1 -triple x86_64-unknown-linux-gnu -emit-obj -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu x86-64 -fopenmp -resource-dir clang/22 -mframe-pointer=all -o /tmp/test-9ad73e.o -x f95 test.F90
warning: loc("test.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ export OMP_NUM_THREADS=64
scamp:$ ./test ; ./test ; ./test
Serial Difference: 34776.
Serial Difference: 28728.
Serial Difference: 39312.
scamp:$ flang test.F90 -o test -fopenmp -O1
warning: loc("test.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ ./test ; ./test ; ./test
Serial Difference: 33264.
Serial Difference: 30240.
Serial Difference: 43848.
```
If I set 'OMP_NUM_THREADS' to 1, I get the serial answer. I also get the right answer if I build with "-O1" and (oddly) comment out the "a2=0.0d0" line:
```
scamp:$ flang test-a2commentedout.F90 -o test -fopenmp -O1
warning: loc("test-a2commentedout.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ ./test ; ./test ; ./test
Serial Difference: 0.
Serial Difference: 0.
Serial Difference: 0.
```
But even with the "a2=0.0d0" line commented out, the unoptimized build still shows the race condition.
Interestingly, GCC (14.1.0 here) shows a similar problem without optimization: the unoptimized build also exhibits a race condition, while "-O1" gives the correct answer even without commenting out the "a2=0.0d0" line. This makes me suspect that the issue is something subtle, and maybe an awkward edge case. NVHPC (25.7) shows the correct answer in all cases.
```
scamp:$ gfortran test.F90 -o test -fopenmp
scamp:$ ./test ; ./test ; ./test
Serial Difference: 287280.000
Serial Difference: 317520.000
Serial Difference: 427896.000
scamp:$ gfortran test.F90 -o test -fopenmp -O1
scamp:$ ./test ; ./test ; ./test
Serial Difference: 0.00000000
Serial Difference: 0.00000000
Serial Difference: 0.00000000
scamp:$ nvfortran test.F90 -o test -mp
scamp:$ ./test ; ./test ; ./test
Serial Difference: 0.000000
Serial Difference: 0.000000
Serial Difference: 0.000000
```
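
As a cross-check of the constants wired into the checksum, the serial arithmetic can be modeled outside any compiler. The sketch below is in Python rather than Fortran, and it assumes that `j*5/100` is evaluated with Fortran integer division and that every sum is small enough to be exact in single precision. The function names and the `REPORTED` list are illustrative and not part of the reproducer.

```python
# Model of the reproducer's serial arithmetic.
# Assumptions: j*5/100 uses integer division (all operands are non-negative,
# so truncation and floor agree), and all sums are exact in single precision.
N = 256

# Nonzero "Serial Difference" values copied from the transcripts above
# (Flang with no optimization flag, Flang -O1, gfortran with no optimization flag).
REPORTED = [34776, 28728, 39312, 33264, 30240, 43848, 287280, 317520, 427896]


def serial_row_sum(n: int = N) -> int:
    """Inner loop: a1 = sum over j = 0..n-1 of (j*5)/100."""
    return sum((j * 5) // 100 for j in range(n))


def serial_total(n: int = N) -> int:
    """Outer loop: a(1,i) = a1 for i = 0..n-1; the rest of a stays 0."""
    return n * serial_row_sum(n)


def serial_difference(total: int) -> int:
    """The two checksum statements from the reproducer."""
    return total - 500000 + 112928


if __name__ == "__main__":
    row = serial_row_sum()
    total = serial_total()
    print(row, total, serial_difference(total))  # 1512 387072 0
    # Express each reported difference in units of one inner-loop sum.
    print([d // row for d in REPORTED], all(d % row == 0 for d in REPORTED))
```

Under those assumptions each inner-loop sum is 1512, the serial total is 387072, and the printed difference is 0, which matches the wired-in constants. Every nonzero difference in the transcripts is an exact multiple of 1512 (23, 19, 26, 22, 20, 29 for Flang and 190, 210, 283 for gfortran). That may hint that whole inner-loop sums are being added more than once, though it is only an inference from the numbers.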