Issue 69318
Summary [OMPT] Runtime can segfault during `ompt_start_tool` when offloading and calling `std::` functions
Labels new issue
Assignees
Reporter Thyre
## Description

This issue is a bit hard for me to understand, but I'll try to describe it as best as I can. 

While investigating how our HIP and OMPT adapters interact in programs that use OpenMP target code, I found that the program would end in a segmentation fault for no apparent reason. I then opened issues against both `rocm_smi_lib` and `HIP`, since both showed unexpected behavior when functions of their interfaces were called during `ompt_start_tool`.

Here are the corresponding issues:
- https://github.com/ROCm-Developer-Tools/HIP/issues/3330
- https://github.com/RadeonOpenCompute/rocm_smi_lib/issues/129

Upon further investigation, this seems to affect LLVM in general as well. Just using `std::cout` will cause the program to crash if it is called during `ompt_start_tool` or the initialization function. With offloading disabled, the same code works as expected.

In the HIP issue, I took a look at the order in which the libraries are loaded. Here, I noticed differences in when the OMPT interface is initialized. For host-only OpenMP, OMPT seems to be initialized before the first OpenMP directive or function call. For host and target OpenMP, the initialization seems to occur when the shared libraries are loaded, which is before `main` is entered. _Maybe_ this has something to do with the issue, but I'm just guessing.
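
To make the suspected ordering concrete, here is a minimal sketch of code that runs while initializers are processed, before `main` is entered. It assumes GCC or Clang (the `constructor` attribute is a compiler extension), and `early_hook` and `early_hook_calls` are hypothetical names that only stand in for a library's load-time initializer. This is not how `libomptarget` is implemented.

```cpp
#include <cstdio>

// Counter with constant initialization, so it is valid before any
// dynamic initializer has run.
static int g_early_hook_calls = 0;

// Hypothetical stand-in for a library's load-time initializer.
// GCC and Clang run functions marked with this attribute before main().
__attribute__((constructor))
static void early_hook() {
    ++g_early_hook_calls;
    // C stdio does not depend on C++ static initialization.
    std::fputs("early_hook: called before main\n", stderr);
}

// Reports how often the hook ran; expected to be 1 once main() starts.
int early_hook_calls() { return g_early_hook_calls; }
```

Calling `early_hook_calls()` at the top of `main` should return 1, showing that the hook already ran. Anything such a hook touches must be usable that early, which may not hold for every C++ object with dynamic initialization.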

## Reproducer

The following small program reproduces the issue:

```cpp
#include <omp-tools.h>
#include <iostream>

int ompt_initialize(ompt_function_lookup_t lookup, int initial_device_num, ompt_data_t *tool_data) {
    return 1; //success
}

void ompt_finalize(ompt_data_t *tool_data) {}

#ifdef __cplusplus
extern "C" {
#endif
ompt_start_tool_result_t *ompt_start_tool(unsigned int omp_version, const char *runtime_version) {
    static ompt_start_tool_result_t ompt_start_tool_result = {&ompt_initialize, &ompt_finalize, 0};

    std::cout << "Hello World!" << std::endl;

    return &ompt_start_tool_result;
}
#ifdef __cplusplus
}
#endif

int main(void)
{
    #pragma omp parallel
    {}
    return 0;
}
```

When the reproducer is compiled with a recent trunk version of LLVM and offloading enabled, it crashes during startup:

```console
$ clang++ --version
clang version 18.0.0 (https://github.com/llvm/llvm-project.git e483673246bdee06e54ec06fd04236bc9fee7f63)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/software/software/LLVM/git/bin
$ clang++ -fopenmp --offload-arch=sm_70 -gdwarf-4 reproducer.cpp
$ gdb ./a.out 
[...]
(gdb) run
Starting program: /home/jreuter/tmp/Error/ompt_cpp_init/a.out 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7d3bf8a in std::ostream::sentry::sentry(std::ostream&) () from /lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0  0x00007ffff7d3bf8a in std::ostream::sentry::sentry(std::ostream&) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#1  0x00007ffff7d3ca0c in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x00007ffff7d3cebb in std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*) () from /lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x0000555555555252 in ompt_start_tool (omp_version=201611, 
    runtime_version=0x7ffff7bd2c66 <__kmp_version_lib_ver+6> "LLVM OMP version: 5.0.20140926") at reproducer.cpp:16
#4  0x00007ffff7bc3e13 in ompt_pre_init () from /opt/software/software/LLVM/git/lib/libomp.so
#5  0x00007ffff7b3cb8d in __kmp_do_serial_initialize() () from /opt/software/software/LLVM/git/lib/libomp.so
#6  0x00007ffff7b47c8c in __kmp_serial_initialize () from /opt/software/software/LLVM/git/lib/libomp.so
#7  0x00007ffff7bc4a1f in ompt_libomp_connect () from /opt/software/software/LLVM/git/lib/libomp.so
#8  0x00007ffff7e9aeb1 in llvm::omp::target::ompt::connectLibrary() () from /opt/software/software/LLVM/git/lib/libomptarget.so.18git
#9  0x00007ffff7e9bc95 in init() () from /opt/software/software/LLVM/git/lib/libomptarget.so.18git
#10 0x00007ffff7fc947e in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffd2c8, env=env@entry=0x7fffffffd2d8)
    at ./elf/dl-init.c:70
#11 0x00007ffff7fc9568 in call_init (env=0x7fffffffd2d8, argv=0x7fffffffd2c8, argc=1, l=<optimized out>) at ./elf/dl-init.c:33
#12 _dl_init (main_map=0x7ffff7ffe2e0, argc=1, argv=0x7fffffffd2c8, env=0x7fffffffd2d8) at ./elf/dl-init.c:117
#13 0x00007ffff7fe32ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#14 0x0000000000000001 in ?? ()
#15 0x00007fffffffd70f in ?? ()
#16 0x0000000000000000 in ?? ()
```

Without offloading enabled, everything works:

```console
$ clang++ -fopenmp -gdwarf-4 reproducer.cpp
$ ./a.out
Hello World!
```
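
If the crash comes from `std::cout` not being constructed yet when `ompt_start_tool` runs (an assumption that follows the guess in the description, not something verified here), one thing to try is forcing stream initialization before the first use. The sketch below does that with `std::ios_base::Init`. The names `early_safe_print`, `early_print_hook`, and `early_print_calls` are hypothetical, and the `constructor` attribute (GCC/Clang) only mimics a call made before `main`. Whether this avoids the segfault in the offloading build has not been tested.

```cpp
#include <iostream>

// Hypothetical helper, not part of OMPT: prints a message through
// std::cout after making sure the standard streams exist.
void early_safe_print(const char *msg) {
    // Constructing an ios_base::Init object guarantees that std::cout
    // and the other standard stream objects have been initialized.
    static std::ios_base::Init ensure_streams;
    std::cout << msg << std::endl;
}

// Constant-initialized counter, valid before dynamic initialization.
static int g_early_print_calls = 0;

// Hypothetical stand-in for a call made before main(), similar in timing
// to the load-time call into ompt_start_tool shown in the backtrace.
__attribute__((constructor))
static void early_print_hook() {
    early_safe_print("Hello from an early initializer");
    ++g_early_print_calls;
}

// Reports how often the early hook ran.
int early_print_calls() { return g_early_print_calls; }
```

In the reproducer, the `std::cout` line in `ompt_start_tool` would be replaced by a call to `early_safe_print("Hello World!")`.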