junrushao commented on issue #307: URL: https://github.com/apache/tvm-ffi/issues/307#issuecomment-3610468473
Here's what ChatGPT says:

---------------------------------

Short version: that PR moved all the “potentially-throwing string work” *out* of a `noexcept` function and into a normal function. With GCC 14 that’s exactly what toggles whether the compiler emits a reference to `__cxa_call_terminate`, so the symbol set of `libtvm_ffi.so` changes before vs. after the PR.

---

### What PR #240 actually changed

From the mail-archive diff of tvm-ffi PR #240, the key changes in `include/tvm/ffi/error.h` are: ([Mail Archive][1])

1. **Introduced `Error::FullMessage()`**:

   ```cpp
   std::string FullMessage() const {
     ErrorObj* obj = static_cast<ErrorObj*>(data_.get());
     return (std::string("Traceback (most recent call last):\n") +
             TracebackMostRecentCallLast() +
             std::string(obj->kind.data, obj->kind.size) + std::string(": ") +
             std::string(obj->message.data, obj->message.size) + '\n');
   }
   ```

   This builds the full traceback + kind + message, using a bunch of `std::string` operations that can allocate and therefore **can throw**.

2. **Changed `Error::what()` to a trivial “just return pointer” implementation**:

   ```cpp
   const char* what() const noexcept(true) override {
     ErrorObj* obj = static_cast<ErrorObj*>(data_.get());
     return obj->message.data;
   }
   ```

   Before the PR, the implementation of `what()` itself *built* that full string:

   ```cpp
   const char* what() const noexcept(true) override {
     thread_local std::string what_data;
     ErrorObj* obj = static_cast<ErrorObj*>(data_.get());
     what_data = (std::string("Traceback (most recent call last):\n") +
                  TracebackMostRecentCallLast() +
                  std::string(obj->kind.data, obj->kind.size) + std::string(": ") +
                  std::string(obj->message.data, obj->message.size) + '\n');
     return what_data.c_str();
   }
   ```

3. **`ErrorBuilder::~ErrorBuilder()` still throws, but now logs via `FullMessage()` instead of `what()`**:

   ```cpp
   [[noreturn]] ~ErrorBuilder() noexcept(false) {
     ::tvm::ffi::Error error(std::move(kind_), stream_.str(), std::move(backtrace_));
     if (log_before_throw_) {
       std::cerr << error.FullMessage();  // previously error.what()
     }
     throw error;
   }
   ```

So the *signature* of the destructor and of `what()` didn’t change in this patch (the destructor was already `noexcept(false)`); what changed is **where the string-building work lives**.

---

### How that affects `__cxa_call_terminate`

GCC 14 introduced a new runtime helper `__cxa_call_terminate()` in libstdc++ to implement the behavior “a `noexcept` function let an exception escape → call `std::terminate()`.” If you compile code with GCC 14 that has a `noexcept` function whose body *may throw*, the compiler emits a reference to this symbol. Older libstdc++ doesn’t have it, which is why mixing GCC 14-compiled code with older libstdc++ leads to undefined-reference / undefined-symbol errors. ([Stack Overflow][2])

Now look at the two versions:

* **Before PR #240**
  `Error::what()` is declared `noexcept(true)` and its body constructs a `std::string` using concatenation and `TracebackMostRecentCallLast()`. All of those `std::string` operations are *not* `noexcept`, so from the compiler’s point of view, `what()` is a `noexcept` function containing potentially-throwing code.

  Under GCC 14, that combination (noexcept + potentially-throwing) triggers generation of code which, if an exception propagates out of `what()`, calls `__cxa_call_terminate`. That makes `__cxa_call_terminate` show up as an imported symbol of `libtvm_ffi.so`.
* **After PR #240**
  `what()` still has the `noexcept(true)` signature, but its body is now just:

  ```cpp
  ErrorObj* obj = static_cast<ErrorObj*>(data_.get());
  return obj->message.data;
  ```

  These operations are trivially non-throwing (no allocations, no `std::string` operations, no explicit `throw`). The compiler can now *prove* that `what()` cannot throw, so it no longer needs to generate any “if an exception escapes, call terminate” path, and therefore it no longer emits a reference to `__cxa_call_terminate` from this function.

  The heavy string-building logic moved into `FullMessage()`, which is **not declared `noexcept`**, and is called from a destructor that is also `noexcept(false)`. Exceptions are allowed to propagate normally there, so no `__cxa_call_terminate` is needed.

That’s why, when you compare `libtvm_ffi.so` built from before vs. after PR #240 (with GCC 14), `grep -nr __cxa_call_terminate` shows a change in the symbol set: the pre-PR library references `__cxa_call_terminate`, the post-PR one doesn’t. ([Mail Archive][1])

---

### How this relates to your `undefined symbol: __cxa_call_terminate`

Putting it together with your original failure:

* Your `libtvm.so` (from TileLang) depends on `libtvm_ffi.so`, which in turn is built from tvm-ffi >= 0.1.2 that incorporates PR #240 or nearby changes, using GCC 14’s toolchain. ([GitHub][3])
* On your runtime system, the `libstdc++.so.6` that gets loaded is **older** and doesn’t provide `__cxa_call_terminate`.
* If your particular tvm-ffi wheel still references `__cxa_call_terminate` (e.g., built just before PR #240, or from other code paths), the dynamic loader can’t resolve that symbol and you see:

  ```text
  OSError: ... libtvm.so: undefined symbol: __cxa_call_terminate
  ```

After PR #240, once everything is rebuilt with it, the `Error::what()` path no longer requires `__cxa_call_terminate`, so that particular dependency goes away and the symptom can disappear, even though the *root* mismatch (GCC 14 vs. older libstdc++) is still there.

---

### Practical implication

So the contributor’s observation:

> the symbol of libtvm_ffi.so itself changes before and after that PR, you can check by `grep -nr __cxa_call_terminate`

is spot on: the change in how `Error::what()` is implemented moves potentially-throwing code out of a `noexcept` function, and that’s exactly what turns the `__cxa_call_terminate` reference on or off in the compiled shared library.

If you want to be fully robust:

* Make sure **all** the C++ pieces (tvm-ffi, tvm, tilelang) are built against the same libstdc++ ABI version, or
* Stick to a tvm-ffi version/wheel that doesn’t end up referencing `__cxa_call_terminate` on your platform, or
* Upgrade your runtime libstdc++ to one that includes `__cxa_call_terminate` (the GCC-14 era libstdc++).
([Stack Overflow][2])

But the specific answer to “why did the symbol in libtvm_ffi.so change before/after that PR?” is: **because PR #240 moved throwing string operations out of a `noexcept` function, so GCC 14 stopped emitting the `__cxa_call_terminate` path for it.**

[1]: https://www.mail-archive.com/commits%40tvm.apache.org/msg116566.html "(tvm-ffi) branch main updated: [ERROR] Make Error conform more to std (#240)"
[2]: https://stackoverflow.com/questions/79815074/linking-fails-with-in-function-main-cold-undefined-reference-to-cxa-call "c++ - Linking fails with: in function `main.cold': undefined reference to `__cxa_call_terminate' - Stack Overflow"
[3]: https://github.com/apache/tvm-ffi/issues/307 "[Bug] Wheel works with 0.1.0 does not work with 0.1.4 · Issue #307 · apache/tvm-ffi · GitHub"
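
For anyone who wants to poke at the mechanism in isolation, here is a minimal sketch of the two shapes of `what()` discussed above. It is not tvm-ffi code: `Payload`, `WhatBefore`, `WhatAfter`, and the two `k...` constants are hypothetical names made up for illustration. It only shows, via the `noexcept` operator, that string concatenation counts as potentially throwing while a plain pointer read does not. Whether a particular compiler actually emits a `__cxa_call_terminate` reference for the first shape depends on the compiler version and optimization settings.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for the error payload; NOT the real tvm-ffi ErrorObj.
struct Payload {
  const char* message;
};

// Shape of the pre-PR what(): declared noexcept, but the body builds a
// std::string, which may allocate and therefore may throw. If it did, the
// exception could not leave the function and the program would terminate.
inline const char* WhatBefore(const Payload& p) noexcept {
  thread_local std::string buf;
  buf = std::string("Traceback:\n") + p.message + '\n';
  return buf.c_str();
}

// Shape of the post-PR what(): only reads a pointer, nothing that can throw.
inline const char* WhatAfter(const Payload& p) noexcept {
  return p.message;
}

// The noexcept operator reports at compile time whether an expression is
// declared non-throwing. Concatenating two strings is not; reading a
// pointer member is.
constexpr bool kConcatMayThrow =
    !noexcept(std::declval<const std::string&>() +
              std::declval<const std::string&>());
constexpr bool kPointerReadMayThrow =
    !noexcept(std::declval<const Payload&>().message);

static_assert(kConcatMayThrow, "std::string concatenation may throw");
static_assert(!kPointerReadMayThrow, "a pointer read cannot throw");
```

Compiling each shape into its own shared object and inspecting the dynamic symbol table (for example with `nm -D`) would be one way to check whether your toolchain adds the terminate helper for the first shape only.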
