This is an automated email from the ASF dual-hosted git repository.
junrushao pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm-ffi.git
The following commit(s) were added to refs/heads/main by this push:
new 395db3c Add Support for NVIDIA Ampere GPUs in _get_cuda_target (#440)
395db3c is described below
commit 395db3cef62f430831eb9e927357334ed3fdfade
Author: Yuhong Guo <[email protected]>
AuthorDate: Sat Feb 14 04:54:45 2026 +0800
Add Support for NVIDIA Ampere GPUs in _get_cuda_target (#440)
I'm using SGLang, which relies on TVM-FFI, on a machine equipped with an
NVIDIA A10 GPU, and encountered the following error:
<img width="2864" height="2180" alt="image"
src="https://github.com/user-attachments/assets/d014ed35-7940-44b1-bf36-6950a9d6d14f"
/>
This issue is commonly observed on NVIDIA Ampere-generation GPUs (e.g.,
A10, A100); see the related discussions:
https://github.com/sgl-project/sglang/issues/18108,
https://github.com/sgl-project/sglang/pull/18496,
https://github.com/apache/tvm-ffi/issues/430.
The root cause is that older NVIDIA drivers (commonly deployed on Ampere
systems) do not support the compute_cap query field in nvidia-smi. As a
result, _get_cuda_target fails when trying to auto-detect the CUDA
compute capability.
<img width="862" height="66" alt="image"
src="https://github.com/user-attachments/assets/66af252d-baeb-48f7-af1a-539e27d62899"
/>
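For reference, the failing probe is presumably the `compute_cap` query shown below (the exact command and error text here are an illustration, not copied from `extension.py`):
```bash
# Works on recent drivers, printing e.g. "8.6";
# on legacy drivers nvidia-smi rejects the field with an error such as:
#   Field "compute_cap" is not a valid field to query.
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
```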
To address this, we fall back to querying the GPU name via:
```bash
nvidia-smi --query-gpu=name --format=csv,noheader
```
<img width="728" height="96" alt="image"
src="https://github.com/user-attachments/assets/102c58f4-69ba-4649-ad06-17faaf686699"
/>
and then map known Ampere GPU names (e.g., "NVIDIA A10") to their
corresponding compute capabilities (e.g., 8.6).
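For clarity, here is a minimal standalone sketch of that name-based fallback. It mirrors the patch below; the helper name is only for illustration, and the authoritative version is `_get_cuda_target` in `python/tvm_ffi/cpp/extension.py`:
```python
import subprocess
from typing import Optional

# Known Ampere GPUs and their compute capabilities; "A100" is listed before
# "A10" so the longer name matches first (dicts preserve insertion order).
AMPERE_ARCH_MAP = {"A100": ("8", "0"), "A10": ("8", "6")}


def gencode_flag_from_gpu_name() -> Optional[str]:
    """Map the nvidia-smi GPU name to an NVCC -gencode flag, or None if unknown."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        check=True,
        text=True,
    )
    gpu_name = out.stdout.strip().split("\n")[0]  # e.g. "NVIDIA A10"
    for key, (major, minor) in AMPERE_ARCH_MAP.items():
        if key in gpu_name:
            return f"-gencode=arch=compute_{major}{minor},code=sm_{major}{minor}"
    return None
```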
This change enables robust GPU detection on Ampere devices with legacy
drivers while maintaining backward compatibility.
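If neither query succeeds, the patch now raises an error pointing at the
`TVM_FFI_CUDA_ARCH_LIST` environment variable. As an assumption about its usage (the
accepted format is not shown in this commit, so check `extension.py` for details), a
manual override would look something like:
```bash
# Hypothetical value; verify the exact format extension.py expects.
export TVM_FFI_CUDA_ARCH_LIST="8.6"
```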
---------
Co-authored-by: gemini-code-assist[bot]
<176961590+gemini-code-assist[bot]@users.noreply.github.com>
---
python/tvm_ffi/cpp/extension.py | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/python/tvm_ffi/cpp/extension.py b/python/tvm_ffi/cpp/extension.py
index e988a77..b558fb0 100644
--- a/python/tvm_ffi/cpp/extension.py
+++ b/python/tvm_ffi/cpp/extension.py
@@ -154,8 +154,27 @@ def _get_cuda_target() -> str:
         major, minor = compute_cap.split(".")
         return f"-gencode=arch=compute_{major}{minor},code=sm_{major}{minor}"
     except Exception:
-        # fallback to a reasonable default
-        return "-gencode=arch=compute_70,code=sm_70"
+        try:
+            # For old drivers, there is no compute_cap, but we can use the GPU name to determine the architecture.
+            ampere_arch_map = {
+                "A100": ("8", "0"),
+                "A10": ("8", "6"),
+            }
+            status = subprocess.run(
+                args=["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
+                capture_output=True,
+                check=True,
+                text=True,
+            )
+            gpu_name = status.stdout.strip().split("\n")[0]
+            for gpu_key, (major, minor) in ampere_arch_map.items():
+                if gpu_key in gpu_name:
+                    return f"-gencode=arch=compute_{major}{minor},code=sm_{major}{minor}"
+        except (subprocess.CalledProcessError, FileNotFoundError):
+            pass
+        raise RuntimeError(
+            "Could not detect CUDA compute_cap automatically. Please set TVM_FFI_CUDA_ARCH_LIST environment variable."
+        )
 
 
 def _run_command_in_dev_prompt(