Issue 150918
Summary lldMain causes hang when called from forked process
Labels new issue
Assignees
Reporter PMylon
    When lld::lldMain() is called in a child process spawned via fork(), the child hangs indefinitely in pthread_cond_wait() during teardown.

**Root cause**: 
parallelFor uses parallel::TaskGroup::spawn, which dispatches to the DefaultExecutor (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/Parallel.cpp#L198), and getDefaultExecutor defines a static ThreadPoolExecutor (https://github.com/llvm/llvm-project/blob/main/llvm/lib/Support/Parallel.cpp#L165). Because the executor is static, it is created once and lives for the rest of the process in which it was created. fork() duplicates only the calling thread of the parent in the child process (https://pubs.opengroup.org/onlinepubs/9699919799/functions/fork.html), so if the pool already exists in the parent, the child inherits a copy of the pool's state but none of its worker threads. I therefore assume that, during cleanup, the child waits indefinitely for the static ThreadPool's worker threads to finish, although those threads do not exist in the child.
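
To illustrate the fork() behaviour described above, here is a minimal, self-contained sketch for POSIX systems. It does not use LLVM's Parallel.cpp. The worker loop, the counters, and every name in it (workerLoop, submitAndWait, runForkDemo) are hypothetical stand-ins for a static thread pool. The wait is bounded by a timeout so that the demo reports the problem instead of hanging.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

using namespace std::chrono_literals;

// Hypothetical stand-in for a static thread pool: a single worker thread that
// "completes" every request it observes. None of these names come from LLVM.
std::atomic<int> Requests{0};
std::atomic<int> Completed{0};
std::atomic<bool> Stop{false};

void workerLoop() {
  while (!Stop.load()) {
    if (Completed.load() < Requests.load())
      Completed.fetch_add(1);
    else
      std::this_thread::sleep_for(1ms);
  }
}

// Submit one request, then poll until the worker has completed it or the
// timeout expires. Returns true only if the request was completed in time.
bool submitAndWait(std::chrono::milliseconds Timeout) {
  const int Target = Requests.fetch_add(1) + 1;
  const auto Deadline = std::chrono::steady_clock::now() + Timeout;
  while (std::chrono::steady_clock::now() < Deadline) {
    if (Completed.load() >= Target)
      return true;
    std::this_thread::sleep_for(1ms);
  }
  return false;
}

struct ForkDemoResult {
  bool ParentCompleted; // request served in the parent, where the worker runs
  bool ChildCompleted;  // request served in the forked child
};

ForkDemoResult runForkDemo() {
  std::thread Worker(workerLoop); // the "pool" is created in the parent
  ForkDemoResult R{};
  R.ParentCompleted = submitAndWait(2000ms);

  pid_t Pid = fork(); // only the calling thread is duplicated in the child
  assert(Pid >= 0 && "fork failed");
  if (Pid == 0) {
    // Child: the counters were copied, but the worker thread was not, so
    // nothing can ever increment Completed here.
    bool Ok = submitAndWait(200ms);
    _exit(Ok ? 0 : 1); // skip atexit handlers and static destructors
  }

  int Status = 0;
  waitpid(Pid, &Status, 0);
  R.ChildCompleted = WIFEXITED(Status) && WEXITSTATUS(Status) == 0;

  Stop.store(true);
  Worker.join();
  return R;
}
```

Calling runForkDemo() from main should yield ParentCompleted == true and ChildCompleted == false. A pool that waits on a condition variable without a timeout would block forever at the point where this sketch gives up, which is consistent with the hang in pthread_cond_wait() reported here.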

We experienced this issue in triton (https://github.com/triton-lang/triton), which was causing the following test to fail:
https://github.com/triton-lang/triton/blob/main/python/test/unit/runtime/test_subproc.py#L68

We implemented a workaround (https://github.com/triton-lang/triton/blob/main/third_party/amd/python/triton_amd.cc#L127) that runs lld single-threaded, but we would like to discuss whether a better long-term solution is possible that does not require disabling multithreading. One option might be to support passing a non-default executor to TaskGroup, via either a constructor or a setter. We are happy to hear and discuss any alternative solutions that fit better with LLVM's design or require less 'intrusive' changes to the codebase.

**How to reproduce**:
Follow the instructions in this repo: [lld fork issue reproducers](https://github.com/PMylon/lld_fork_hang_reproducers). It contains two reproducers: one calls lld::lldMain and the other calls parallelFor directly.

CC: @antiagainst @makslevental FYI