lhames wrote:

I think the analysis is mostly right, but the fix doesn't work in general.

The general problem is that some global-lifetime objects used for compilation 
(especially function-local statics) may be initialised after the JIT global, 
and will consequently be destroyed before it. The specific crash I see is:

```text
* thread #1, queue = 'com.apple.main-thread', stop reason = libc++: /Library/Developer/CommandLineTools/SDKs/MacOSX26.5.sdk/usr/include/c++/v1/__vector/vector.h:406: libc++ Hardening assertion __n < size() failed: vector[] index out of bounds

    frame #2: 0x0000000108ecfdb4 ClangReplInterpreterTests`llvm::SDNode::getValueTypeList(VT=(SimpleTy = i64)) at SelectionDAG.cpp:13858:11
   13855          static EVTArray SimpleVTArray;
   13856
   13857          assert(VT < MVT::VALUETYPE_SIZE && "Value type out of range!");
-> 13858          return &SimpleVTArray.VTs[VT.SimpleTy];
   13859        }
```

In this case the relevant sequence of construction / destruction was:
1. A globally scoped `LLJIT` object gets constructed.
2. `SimpleVTArray` gets constructed by the first call to 
`llvm::SDNode::getValueTypeList` during compilation of some code.
3. `std::exit` gets called, running destructors in reverse registration order.
4. `SimpleVTArray` gets destroyed.
5. `LLJIT` destruction triggers deinitialization of JIT'd code, which triggers 
compilation of the deinitializer symbols, which calls (through many layers) down 
to `llvm::SDNode::getValueTypeList`, which uses the already-destroyed 
`SimpleVTArray`.
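
The ordering above can be reproduced without LLVM. The following is a minimal sketch using stand-in types: `FakeJIT`, `LazyTable`, `fakeCompile`, `eventLog` and `TableDestroyed` are all hypothetical names, not LLVM APIs. Unlike the real code, `fakeCompile` checks a flag and bails out instead of touching the destroyed object, so the sketch itself stays well-defined.

```c++
#include <cstdio>
#include <string>
#include <vector>

// Event log, leaked on purpose so it stays usable during static destruction.
std::vector<std::string> &eventLog() {
  static auto *Log = new std::vector<std::string>();
  return *Log;
}

// Plain bool: constant-initialized and never destroyed, so it is safe to read
// at any point during shutdown.
bool TableDestroyed = false;

// Stand-in for SimpleVTArray.
struct LazyTable {
  LazyTable() { eventLog().push_back("table constructed"); }
  ~LazyTable() {
    TableDestroyed = true;
    eventLog().push_back("table destroyed");
  }
};

// Stand-in for a compiler entry point such as SDNode::getValueTypeList.
// Returns false if it would have needed the already-destroyed table.
bool fakeCompile(const std::string &What) {
  if (TableDestroyed) {
    // Bail out before reaching the static below; the real code has no guard.
    eventLog().push_back("compile " + What + ": table already destroyed");
    return false;
  }
  static LazyTable Table; // Constructed on the first call only.
  (void)Table;
  eventLog().push_back("compile " + What + ": ok");
  return true;
}

// Stand-in for a globally scoped LLJIT.
struct FakeJIT {
  FakeJIT() { eventLog().push_back("jit constructed"); }
  ~FakeJIT() {
    // Deinitialization triggers compilation after the table is gone.
    fakeCompile("deinitializers");
    for (const std::string &E : eventLog())
      std::fprintf(stderr, "%s\n", E.c_str());
  }
};

FakeJIT GlobalJIT; // Step 1: constructed before main() runs.
```

If `main` calls `fakeCompile("user code")` once and returns, the log written to stderr at exit ends with `table destroyed` followed by `compile deinitializers: table already destroyed`, mirroring steps 4 and 5.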

Forcing lookups can't fix this in general, since lazy compilation may delay 
compilation until execution (in this case destruction) time.

There are three rigorous, general-purpose solutions:

1. Avoid any global JIT objects. _This is best practice, and should be followed 
wherever possible._
2. Remove all lazy initialization from LLVM. This would fix the issue, but 
would be a lot of engineering work and may not be possible on performance 
grounds (I'm not sure how much we rely on lazy initialization).
3. Tie the lifetime of lazily initialized data to `llvm_shutdown`. A lot of 
engineering work, but at least shouldn't affect performance.
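
As a rough illustration of option 3, lazily initialized data can register its cleanup with an explicit shutdown list instead of the C++ runtime's exit-time destructor list. This is a simplified, single-threaded model, similar in spirit to `llvm::ManagedStatic` but not LLVM's implementation; `ShutdownManaged`, `shutdownList`, `runShutdown`, `ValueTypeTable` and `lookup` are hypothetical names.

```c++
#include <functional>
#include <utility>
#include <vector>

// Cleanup callbacks, run only by an explicit shutdown call. The vector is
// leaked on purpose so that no exit-time destructor ever touches it.
std::vector<std::function<void()>> &shutdownList() {
  static auto *List = new std::vector<std::function<void()>>();
  return *List;
}

// Hypothetical wrapper: constructs T on first use and registers its
// destruction with shutdownList() rather than with the C++ runtime.
// Not thread-safe; a real implementation would need synchronization.
template <typename T> class ShutdownManaged {
  T *Ptr = nullptr;

public:
  T &get() {
    if (!Ptr) {
      Ptr = new T();
      shutdownList().push_back([this] {
        delete Ptr;
        Ptr = nullptr;
      });
    }
    return *Ptr;
  }
  bool isConstructed() const { return Ptr != nullptr; }
};

// Hypothetical counterpart of llvm_shutdown: destroys managed objects in
// reverse order of construction.
void runShutdown() {
  auto &List = shutdownList();
  while (!List.empty()) {
    auto Cleanup = std::move(List.back());
    List.pop_back();
    Cleanup();
  }
}

// Stand-in for lazily initialized compiler data such as SimpleVTArray.
struct ValueTypeTable {
  std::vector<int> VTs = std::vector<int>(8, 0);
};

// Holds only a null pointer at startup, so it has no exit-time destructor.
ShutdownManaged<ValueTypeTable> Table;

// Stand-in for a compiler entry point that needs the table.
int lookup(unsigned Idx) { return Table.get().VTs.at(Idx); }
```

With this shape the table stays valid through static destruction and goes away only when `runShutdown()` is called, so the owner of the JIT decides when compiler state disappears.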

There's also a partial solution that might be useful for your use case, and one 
less-rigorous solution that may be practical.

4. The partial solution: Force LLJIT to look up deinitialization symbols at 
initialization time (this could be always on, or optional). This is only a 
partial solution since the `LLJIT` client would also have to guarantee that any 
code reachable through the deinitializers has also already been compiled, which 
may be difficult where lazy compilation is enabled. 

5. The less-rigorous solution: Add an `atexit` callback _after_ you've compiled 
enough JIT'd code to be confident that any globals that will be needed at 
destruction time have already been constructed (and so had their destructors 
registered). E.g.

```c++
clang::Interpreter &getInterpreter() {
  // createInterpreter() is assumed to return a pointer-like owner, e.g.
  // std::unique_ptr<clang::Interpreter>.
  static auto I = createInterpreter();
  // Trigger compilation, hopefully forcing initialization of any function-local
  // statics needed at destruction time.
  I->ParseAndExecute("void noop_fn(void) {} noop_fn()");
  // This scope_exit destructor should run before the destruction of any
  // function-local statics initialized above, allowing compilation (including
  // lazy compilation) of the destructors to succeed.
  static llvm::scope_exit RunDestructors([&]() { I->runDestructors(); });
  return *I;
}
```

The problem with this is that there's no good way to know what code you need to 
run through the compiler to force initialization of all of the function-local 
statics that may be needed during compilation, since this could depend on the 
specific code being compiled.
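
The ordering that option 5 relies on can be checked in isolation. The sketch below uses `std::atexit` directly and stand-in types; `FakeInterpreter`, `getFakeInterpreter`, `runFakeDestructors`, `fakeCompile`, `LazyTable`, `eventLog` and `TableDestroyed` are hypothetical names. It assumes the warm-up compile touches every lazily constructed object that teardown will need, which is exactly the part that is hard to guarantee in practice.

```c++
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Event log, leaked on purpose so it stays usable during static destruction.
std::vector<std::string> &eventLog() {
  static auto *Log = new std::vector<std::string>();
  return *Log;
}

// Plain bool: constant-initialized and never destroyed.
bool TableDestroyed = false;

// Stand-in for a function-local static such as SimpleVTArray.
struct LazyTable {
  LazyTable() { eventLog().push_back("table constructed"); }
  ~LazyTable() { TableDestroyed = true; }
};

// Stand-in for compilation; returns false if the table is already gone.
bool fakeCompile(const std::string &What) {
  if (TableDestroyed)
    return false; // Bail out before reaching the destroyed static.
  static LazyTable Table; // Constructed on the first call only.
  (void)Table;
  eventLog().push_back("compile " + What + ": ok");
  return true;
}

// Stand-in for running JIT'd destructors, which needs the compiler.
void runFakeDestructors() {
  bool Ok = fakeCompile("destructors");
  assert(Ok && "teardown ran after the table was destroyed");
  (void)Ok;
}

struct FakeInterpreter {
  int Id = 0;
};

FakeInterpreter &getFakeInterpreter() {
  static FakeInterpreter *I = [] {
    auto *New = new FakeInterpreter(); // Leaked for simplicity.
    // Warm-up: constructs LazyTable, registering its destructor first.
    fakeCompile("warm-up");
    // Registered after the table's destructor, so it runs before it.
    std::atexit(runFakeDestructors);
    return New;
  }();
  return *I;
}
```

Because exit-time callbacks and static destructors run in reverse order of registration, `runFakeDestructors` executes while `LazyTable` is still alive. If the warm-up were skipped, the table would be constructed inside the callback instead, and in the real scenario any static first touched after the callback was registered would be destroyed before it runs.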

https://github.com/llvm/llvm-project/pull/196874
_______________________________________________
cfe-commits mailing list
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
