Issue 56453
Summary OpenMP 5.0 target offloading: omp_alloc() undefined in nvlink step
Labels
Assignees
Reporter vincentadamthefirst
I am trying to allocate memory on a target device with OpenMP offloading. For this I use a recently built Clang 15 and CUDA 11.6 (bundled with nvhpc 22.3).

A simple test program:

```c++
#include <omp.h>

int main() {
#pragma omp target teams distribute
    for (int index = 0; index < 100; index++) {
        float *shm = (float *) omp_alloc(20 * 20 * sizeof(float), omp_pteam_mem_alloc);

#pragma omp parallel num_threads(20 * 20) shared(shm) default(none)
        {
            int threadNum = omp_get_thread_num();
            shm[threadNum] = threadNum;
#pragma omp barrier
            // some work on shared memory & write-back
        }
    }
}
```

Compiled using
```bash
clang++ -O3 -std=c++17 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-version=51 --cuda-path=/opt/nvidia/hpc_sdk/Linux_x86_64/22.3/cuda/11.6/ main.cpp
```

I expect a successful compilation but I get

```console
nvlink error   : Undefined reference to 'omp_alloc' in '/tmp/test-008605-nvptx64-nvidia-cuda-sm_80-400196.cubin'
/usr/local/bin/clang-linker-wrapper: error: 'nvlink' failed
clang-15: error: linker command failed with exit code 1 (use -v to see invocation)
```

at the link step. Am I missing something crucial, or is `omp_alloc()` not fully supported in target regions yet? According to [the docs](https://clang.llvm.org/docs/OpenMPSupport.html#openmp-5-0-implementation-details), it should be. I have no problems with the directive form, which does more or less the same thing (`#pragma omp allocate(shm) allocator(omp_pteam_mem_alloc)`).
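
For comparison, the directive-based variant might look roughly like the sketch below. It is not the exact code I ran: the function name `run_with_allocate_directive`, the two constants, and the reduction added to get a checkable result are illustrative assumptions. I also dropped `num_threads` and used `parallel for`, so the result does not depend on how many threads the runtime actually provides. The `omp.h` include is guarded so the file also builds as plain serial C++ when OpenMP is disabled.

```c++
#ifdef _OPENMP
#include <omp.h>  // declares omp_pteam_mem_alloc
#endif

constexpr int kIterations = 100;   // outer loop trip count, as in the reproducer
constexpr int kBufferSize = 20 * 20;  // elements in the per-team buffer

// Hypothetical helper: fills a team-local buffer in every iteration and
// returns the sum of all values written, accumulated across iterations.
long run_with_allocate_directive() {
    long total = 0;
#pragma omp target teams distribute reduction(+ : total)
    for (int index = 0; index < kIterations; index++) {
        // A fixed-size array stands in for the omp_alloc() call; the directive
        // below requests the team-local allocator for it.
        float shm[kBufferSize];
#pragma omp allocate(shm) allocator(omp_pteam_mem_alloc)

#pragma omp parallel for shared(shm)
        for (int i = 0; i < kBufferSize; i++) {
            shm[i] = static_cast<float>(i);
        }
        // The implicit barrier at the end of the parallel region guarantees
        // that all writes to shm are visible here.
        long localSum = 0;
        for (int i = 0; i < kBufferSize; i++) {
            localSum += static_cast<long>(shm[i]);
        }
        total += localSum;
    }
    return total;
}
```

Unlike `omp_alloc()`, this form needs the buffer size at compile time, which is why I would still prefer the runtime call to work.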