[llvm-bugs] [Bug 157018] [Flang][OpenMP] offload hierarchical parallelism failed on NVIDIA GPU

LLVM Bugs via llvm-bugs Sat, 06 Sep 2025 21:15:41 -0700

Issue	157018
Summary	[Flang][OpenMP] offload hierarchical parallelism failed on NVIDIA GPU
Labels	flang
Assignees
Reporter	ye-luo

Reproducer code:
https://github.com/TApplencourt/OvO/blob/master/test_src/fortran/hierarchical_parallelism/reduction_add-real/target_teams_distribute__parallel_do.F90
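
The linked file is not quoted in this report. Judging only from its path, it appears to exercise a `target teams distribute` loop with a nested `parallel do`, each carrying a `+` reduction on a real variable. A minimal sketch of that pattern is given here for orientation. It is not a copy of the OvO source: the program name, loop bounds, and tolerance are illustrative assumptions, and it has not been confirmed to trigger the same failure.

```fortran
! Sketch of a hierarchical-parallelism reduction (assumed shape, not the OvO file).
program hierarchical_reduction_sketch
  implicit none
  integer, parameter :: n0 = 182, n1 = 182   ! loop bounds are assumptions
  integer :: i0, i1
  real :: counter

  counter = 0.0

  ! Outer level: iterations are distributed across teams on the device.
  !$omp target teams distribute reduction(+: counter)
  do i0 = 1, n0
    ! Inner level: each team forks threads and reduces into the same variable.
    !$omp parallel do reduction(+: counter)
    do i1 = 1, n1
      counter = counter + 1.0
    end do
  end do

  ! Every (i0, i1) pair contributes 1, so the expected sum is n0*n1.
  if (abs(counter - real(n0 * n1)) > 0.1) then
    print *, 'wrong value: expected', n0 * n1, 'got', counter
    error stop 1
  end if
  print *, 'ok'
end program hierarchical_reduction_sketch
```

It can be built with the same command line shown in this report, substituting the file name.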


```
$ OMP_TARGET_OFFLOAD=mandatory flang -fopenmp --offload-arch=sm_90 -O3 target_teams_distribute__parallel_do.F90
nvlink warning : Stack size for entry function '__omp_offloading_2e_73c80a0c__QQmain_l20' cannot be statically determined
$ ./a.out 
"PluginInterface" error: Failure to copy data from device to host. Pointers: host = 0x00007fff77470d14, device = 0x00007f57ff600000, size = 4: "unknown or internal error" error in cuMemcpyDtoHAsync: an illegal memory access was encountered
omptarget error: Copying data from device failed.
omptarget error: Call to targetDataEnd failed, abort target.
omptarget error: Failed to process data after launching the kernel.
omptarget error: Consult https://openmp.llvm.org/design/Runtimes.html for debugging options.
omptarget error: Source location information not present. Compile with -g or -gline-tables-only.
omptarget fatal error 1: failure of target construct while offloading is mandatory
Aborted
```
The actual failure occurred during the kernel run. Its error, `an illegal memory access was encountered`, was only caught by the subsequent cuMemcpyDtoHAsync call.

There is no issue when offloading to an AMD GPU (gfx90a).

_______________________________________________
llvm-bugs mailing list
llvm-bugs@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
