This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/main by this push:
     new a17b5d99c7 update
a17b5d99c7 is described below

commit a17b5d99c7e540bf553a686bc6544c00f21d9d51
Author: tqchen <[email protected]>
AuthorDate: Wed Oct 22 08:13:48 2025 -0700

    update
---
 _posts/2025-10-21-tvm-ffi.md | 26 ++++++++++++++++++--------
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/_posts/2025-10-21-tvm-ffi.md b/_posts/2025-10-21-tvm-ffi.md
index c6e6252f81..98a0d824bc 100644
--- a/_posts/2025-10-21-tvm-ffi.md
+++ b/_posts/2025-10-21-tvm-ffi.md
@@ -7,17 +7,17 @@
 
 
 
-We are currently living in an exciting era for AI, where machine learning 
systems and infrastructures are crucial for training and deploying efficient AI 
models. The modern machine learning systems landscape comes rich with diverse 
components, including popular ML frameworks and array libraries like JAX, 
PyTorch, and CuPy. It also includes specialized libraries such as 
FlashAttention, FlashInfer and cuDNN. Furthermore, there's a growing trend of 
ML compilers and domain-specific languages [...]
+We are currently living in an exciting era for AI, where machine learning systems and infrastructure are crucial for training and deploying efficient AI models. The modern machine learning systems landscape is rich with diverse components, including popular ML frameworks and array libraries like JAX, PyTorch, and CuPy. It also includes specialized libraries such as FlashAttention, FlashInfer, and cuDNN. Furthermore, there's a growing trend of ML compilers and domain-specific languages [...]
 
-The exciting growth of the ecosystem is the reason for the fast pace of 
innovation in AI today. However, it also presents a significant challenge: 
**interoperability**. Many of those components need to integrate with each 
other. For example, libraries such as FlashInfer, cuDNN needs to be integrated 
into PyTorch, JAX, TensorRT’s runtime system, each may come with different 
interface requirements. ML compilers and DSLs also usually expose Python JIT 
binding support, while also need to bri [...]
+The exciting growth of the ecosystem is the reason for today's fast pace of 
innovation in AI. However, it also presents a significant challenge: 
**interoperability**. Many of those components need to integrate with each 
other. For example, libraries such as FlashInfer and cuDNN need to be 
integrated into PyTorch, JAX, and TensorRT's runtime system, each of which may 
come with different interface requirements. ML compilers and DSLs also usually 
expose Python JIT binding support, while als [...]
 
 ![image](/images/tvm-ffi/interop-challenge.png){: style="width: 70%; margin: 
auto; display: block;" }
 
-The the core of these interoperability challenges are the **Application Binary 
Interface (ABI)** and the **Foreign Function Interface (FFI)**. **ABI** defines 
how data structures are stored in memory and precisely what occurs when a 
function is called. For instance, the way torch stores Tensors may be different 
from say cupy/numpy, so we cannot directly pass a torch.Tensor pointer and its 
treatment as a cupy.NDArray. The very nature of machine learning applications 
usually mandates cross [...]
+At the core of these interoperability challenges are the **Application Binary Interface (ABI)** and the **Foreign Function Interface (FFI)**. The **ABI** defines how data structures are stored in memory and precisely what occurs when a function is called. For instance, the way PyTorch stores tensors may differ from the way CuPy/NumPy does, so we cannot directly pass a torch.Tensor pointer and treat it as a cupy.NDArray. The very nature of machine learning applications usually mandates cross-languag [...]
 
-All of the above observations call for a **need for ABI and FFI for the ML 
systems** use-cases. Looking at the state today, luckily, we do have something 
to start with – the C ABI, which every programming language speaks and remains 
stable over time. Unfortunately, C only focuses on low-level data types such as 
int, float and raw pointers. On the other end of the spectrum, we know that 
python is something that must gain first-class support, but also there is still 
a need for different-la [...]
+All of the above observations point to a **need for an ABI and FFI for ML systems** use cases. Looking at the current state, luckily, we do have something to start with – the C ABI, which every programming language speaks and which remains stable over time. Unfortunately, C only covers low-level data types such as int, float, and raw pointers. On the other end of the spectrum, we know that Python must gain first-class support, but there is still a need for different-language [...]
 
-This post introduces TVM FFI, an **open ABI and FFI for machine learning 
systems**. The project evolved from multiple years of ABI calling conventions 
design iterations in the Apache TVM project. We find that the design can be 
made generic, independent from the choice of compiler/language and should 
benefit the ML systems community. As a result, we brought into a minimal 
library built from the ground up with a clear intention to become an open, 
standalone library that can be shared and e [...]
+This post introduces TVM FFI, an **open ABI and FFI for machine learning systems**. The project evolved from multiple years of ABI calling-convention design iterations in the Apache TVM project. We find that the design can be made generic, independent of the choice of compiler or language, and should benefit the ML systems community. As a result, we built a minimal library from the ground up with the clear intention of becoming an open, standalone library that can be shared and evolved together [...]
 
 - **Stable, minimal C ABI** designed for kernels, DSLs, and runtime 
extensibility.
 - **Zero-copy interop** across PyTorch, JAX, and CuPy using [DLPack 
protocol](https://data-apis.org/array-api/2024.12/design_topics/data_interchange.html).
@@ -31,13 +31,13 @@ Importantly, the goal of the project is not to create 
another framework or langu
 
 ## **Technical Design**
 
-To start with, we need a mechanism to store the values that are passing across 
machine learning frameworks. It achieves this using a core data structure 
called TVMFFIAny. It is a 16 bytes C structure that follows the design 
principle of tagged-union
+To start with, we need a mechanism to store the values that are passed across machine learning frameworks. TVM FFI achieves this using a core data structure called TVMFFIAny, a 16-byte C structure that follows the design principle of a tagged union.
 
 ![image](/images/tvm-ffi/tvmffiany.png){: style="width: 50%; margin: auto; 
display: block;" }
 
 
 
-The objects in TVMFFIObject are managed as intrusive pointers, where 
TVMFFIObject itself contains the header of the pointer that helps to manage 
type information and deletion. This design allows us to use the same type_index 
mechanism that allows for the future growth and recognition of new kinds of 
objects within the FFI, ensuring extensibility. The standalone deleter ensures 
objects can be safely allocated by one source or language and deleted in 
another place.
+The objects in TVMFFIObject are managed as intrusive pointers, where TVMFFIObject itself contains the header that helps manage type information and deletion. This design reuses the same type_index mechanism, allowing for future growth and recognition of new kinds of objects within the FFI and ensuring extensibility. The standalone deleter ensures objects can be safely allocated by one source or language and deleted in another.
 
 ![image](/images/tvm-ffi/tvmffiobject.png){: style="width: 50%; margin: auto; 
display: block;" }
 
@@ -97,8 +97,18 @@ Once DSL integrates with the ABI, we can leverage the same 
flow to load back and
 ![image](/images/tvm-ffi/mydsl.png){: style="width: 40%; margin: auto; 
display: block;" }
 
 
+## Core Design Principle and Applications
 
-As we can see, the common open ABI foundation offers numerous opportunities 
for ML systems to interoperate. We anticipate that this solution can 
significantly benefit various aspects of ML systems and AI infrastructure:
+
+Returning to the high level, the core design principle of the TVM FFI ABI is to decouple the ABI design from the bindings themselves.
+Most binding generators or connectors focus on point-to-point interop between language A and framework B.
+By designing a common ABI foundation, we can transform point-to-point interop into a mix-and-match approach, where n languages/frameworks connect to the ABI and then out to another m DSLs/libraries.
+The most obvious use case is to expose C++ functions to Python, but we can also use the same mechanism to expose C++ functions to Rust;
+the ABI helps expose WebAssembly/WebGPU to TypeScript in the recent WebLLM project, or expose DSL-generated kernels to these environments.
+The ABI can also serve as a common runtime foundation for compiler-runtime co-design in ML compilers and kernel DSLs. These are just some of the opportunities we may unblock.
+In summary, the common open ABI foundation offers numerous opportunities for ML systems to interoperate. We anticipate that this solution can significantly benefit various aspects of ML systems and AI infrastructure:
 
 * **Kernel libraries**: Ship a single package to support multiple frameworks, 
Python versions, and different languages.
 * **Kernel DSLs**: A reusable ABI for JIT and AOT kernel exposure across frameworks and runtimes.
