oraluben commented on issue #464: URL: https://github.com/apache/tvm-ffi/issues/464#issuecomment-4493962359
## Update: confirmed root cause is ABI mismatch, not missing initialization

After isolation testing and cross-version verification:

### Verified root cause

The tilelang 0.1.9 release binary is compiled against **vendored tvm-ffi v0.1.3 headers**. At runtime, linking against **tvm-ffi >= 0.1.8** causes an ABI mismatch:

- v0.1.3: `TVMFFIErrorCell` = 56 bytes (kind, message, backtrace, update_backtrace), **no cause_chain/extra_context**
- v0.1.8+: `TVMFFIErrorCell` = 72 bytes (+ cause_chain, extra_context)

The ErrorObj destructor in libtvm_ffi.dylib accesses `cause_chain` at the v0.1.8+ offset. Because the struct layout differs between compile time and runtime, the memory at that offset contains stale data instead of nullptr.

### Evidence

| tilelang build (vendored tvm-ffi) | tvm-ffi runtime | Result |
|---|---|---|
| v0.1.9 release (v0.1.3) | 0.1.7 (56B cell) | ✓ |
| v0.1.9 release (v0.1.3) | 0.1.8 (72B cell) | SIGBUS |
| v0.1.9 release (v0.1.3) | 0.1.10 (72B cell) | SIGBUS |
| v0.1.9 release (v0.1.3) | 0.1.11 (72B cell) | SIGBUS |
| refactor branch (v0.1.11) | 0.1.10 (72B cell) | ✓ |
| refactor branch (v0.1.11) | 0.1.11 (72B cell) | ✓ |

The crash occurs only when the vendored tvm-ffi and the runtime tvm-ffi disagree on the struct layout.

### Fix

For downstream projects: update your vendored tvm-ffi to match or exceed the runtime version being linked.

For tvm-ffi: the POC PR #592 adds defensive zeroing in SafeCallContext as a safety net, but the real issue is ABI compatibility across minor versions.
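
### Illustration of the layout mismatch

To make the size and offset argument concrete, here is a minimal sketch using Python's `ctypes`. The struct definitions are simplified stand-ins that I am assuming from the field lists quoted above (each byte-array field modeled as a pointer plus a length, the remaining fields as plain pointers). They are not the actual tvm-ffi headers, and `ByteArray`, `ErrorCellOld`, `ErrorCellNew` and `read_cause_chain_after_old_init` are hypothetical names. On a typical 64-bit platform these stand-ins come out at 56 and 72 bytes, matching the numbers above.

```python
import ctypes


class ByteArray(ctypes.Structure):
    # Assumed stand-in for a (data pointer, length) pair.
    _fields_ = [("data", ctypes.c_void_p), ("size", ctypes.c_size_t)]


# Fields shared by both layouts, in the order listed in the comment above.
OLD_FIELDS = [
    ("kind", ByteArray),
    ("message", ByteArray),
    ("backtrace", ByteArray),
    ("update_backtrace", ctypes.c_void_p),
]


class ErrorCellOld(ctypes.Structure):
    # Hypothetical model of the layout the vendored headers describe.
    _fields_ = list(OLD_FIELDS)


class ErrorCellNew(ctypes.Structure):
    # Hypothetical model of the newer layout with two trailing pointers.
    _fields_ = list(OLD_FIELDS) + [
        ("cause_chain", ctypes.c_void_p),
        ("extra_context", ctypes.c_void_p),
    ]


def read_cause_chain_after_old_init(stale_byte=0xAA):
    """Initialize a cell using the old layout, then read it with the new one."""
    # Memory large enough for the new layout, pre-filled with "stale" bytes.
    buf = bytearray([stale_byte]) * ctypes.sizeof(ErrorCellNew)

    # Code built against the old layout only zeroes the bytes it knows about.
    old_view = ErrorCellOld.from_buffer(buf)
    ctypes.memset(ctypes.addressof(old_view), 0, ctypes.sizeof(ErrorCellOld))

    # Code built against the new layout reads a field past that region.
    new_view = ErrorCellNew.from_buffer(buf)
    return new_view.cause_chain  # None would mean a null pointer


if __name__ == "__main__":
    print("old size:", ctypes.sizeof(ErrorCellOld))
    print("new size:", ctypes.sizeof(ErrorCellNew))
    print("cause_chain offset:", ErrorCellNew.cause_chain.offset)
    print("cause_chain after old-layout init:", read_cause_chain_after_old_init())
```

In this sketch `cause_chain` begins exactly where the old layout ends, so code that only knows the old layout never writes to it. The real crash path involves the destructor dereferencing that value. The sketch only shows why the value is non-null.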
