George-Polya opened a new issue, #17732:
URL: https://github.com/apache/tvm/issues/17732
### Expected behavior
Calling the global function `vm.builtin.paged_attention_kv_cache_popn` via `tvm.get_global_func` succeeds.
### Actual behavior
```
01:06:07 | INFO | loading NVILA-Lite-2B from
/data/models/mlc/dist/NVILA-Lite-2B/ctx32768/NVILA-Lite-2B-q4f16_ft/NVILA-Lite-2B-q4f16_ft-cuda.so
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/NanoLLM/nano_llm/vision/video.py", line 44, in <module>
    model = NanoLLM.from_pretrained(
  File "/opt/NanoLLM/nano_llm/nano_llm.py", line 91, in from_pretrained
    model = MLCModel(model_path, **kwargs)
  File "/opt/NanoLLM/nano_llm/models/mlc.py", line 173, in __init__
    self._kv_cache_pop = tvm.get_global_func('vm.builtin.paged_attention_kv_cache_popn')  # 'vm.builtin.kv_state_popn')
  File "/usr/local/lib/python3.10/dist-packages/tvm/_ffi/registry.py", line 235, in get_global_func
    return _get_global_func(name, allow_missing)
  File "tvm/_ffi/_cython/./packed_func.pxi", line 352, in tvm._ffi._cy3.core._get_global_func
ValueError: Cannot find global function vm.builtin.paged_attention_kv_cache_popn
```
With tvm==0.15.0 this error does not occur, but I instead hit the [vm.builtin.paged_attention_kv_cache_attention_with_fused_qkv error](https://github.com/mlc-ai/mlc-llm/issues/2018).
### Environment
- Platform (e.g. WebGPU/Vulkan/IOS/Android/CUDA): CUDA
- Operating system (e.g. Ubuntu/Windows/MacOS/...): Jetpack 6.1(Ubuntu
22.04)
- Device (e.g. iPhone 12 Pro, PC+RTX 3090, ...): Jetson Orin AGX
- How you installed MLC-LLM (`conda`, source): source, mlc-llm==0.19.0
- How you installed TVM-Unity (`pip`, source): pip / tvm==0.19.0
- Python version (e.g. 3.10): 3.10
- GPU driver version (if applicable): 540.4.0
- CUDA/cuDNN version (if applicable): 12.6
- TVM Unity Hash Tag (`python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))"`, applicable if you compile models):
```
USE_NVTX: OFF
USE_GTEST: OFF
SUMMARIZE: ON
TVM_DEBUG_WITH_ABI_CHANGE: OFF
USE_IOS_RPC: OFF
USE_MSC: OFF
USE_ETHOSU:
CUDA_VERSION: 12.6
USE_LIBBACKTRACE: OFF
DLPACK_PATH: 3rdparty/dlpack/include
USE_TENSORRT_CODEGEN: OFF
USE_THRUST: ON
USE_TARGET_ONNX: OFF
USE_AOT_EXECUTOR: OFF
BUILD_DUMMY_LIBTVM: OFF
USE_CUDNN: ON
USE_TENSORRT_RUNTIME: OFF
USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR: OFF
USE_CCACHE: ON
USE_ARM_COMPUTE_LIB: OFF
USE_CPP_RTVM:
USE_OPENCL_GTEST: /path/to/opencl/gtest
TVM_LOG_BEFORE_THROW: OFF
USE_MKL: OFF
USE_PT_TVMDSOOP: OFF
MLIR_VERSION: NOT-FOUND
USE_CLML: OFF
USE_STACKVM_RUNTIME: ON
USE_GRAPH_EXECUTOR_CUDA_GRAPH: ON
ROCM_PATH: /opt/rocm
USE_DNNL: OFF
USE_MSCCL: OFF
USE_NNAPI_RUNTIME: OFF
USE_VITIS_AI: OFF
USE_MLIR: OFF
USE_RCCL: OFF
USE_LLVM: /usr/bin/llvm-config --link-static
USE_VERILATOR: OFF
USE_TF_TVMDSOOP: OFF
USE_THREADS: ON
USE_MSVC_MT: OFF
BACKTRACE_ON_SEGFAULT: OFF
USE_GRAPH_EXECUTOR: ON
USE_NCCL: OFF
USE_ROCBLAS: OFF
GIT_COMMIT_HASH: 3f30919055d864af3dd03c42b3cb0a878aa2cc25
USE_VULKAN: OFF
USE_RUST_EXT: OFF
USE_CUTLASS: ON
USE_CPP_RPC: OFF
USE_HEXAGON: OFF
USE_CUSTOM_LOGGING: OFF
USE_UMA: OFF
USE_FALLBACK_STL_MAP: OFF
USE_SORT: ON
USE_RTTI: ON
GIT_COMMIT_TIME: 2024-09-26 09:52:46 -0400
USE_HIPBLAS: OFF
USE_HEXAGON_SDK: /path/to/sdk
USE_BLAS: none
USE_ETHOSN: OFF
USE_LIBTORCH: OFF
USE_RANDOM: ON
USE_CUDA: ON
USE_COREML: OFF
USE_AMX: OFF
BUILD_STATIC_RUNTIME: OFF
USE_CMSISNN: OFF
USE_KHRONOS_SPIRV: OFF
USE_CLML_GRAPH_EXECUTOR: OFF
USE_TFLITE: OFF
USE_HEXAGON_GTEST: /path/to/hexagon/gtest
PICOJSON_PATH: 3rdparty/picojson
USE_OPENCL_ENABLE_HOST_PTR: OFF
INSTALL_DEV: OFF
USE_PROFILER: OFF
USE_NNPACK: OFF
LLVM_VERSION: 17.0.6
USE_MRVL: OFF
USE_OPENCL: OFF
COMPILER_RT_PATH: 3rdparty/compiler-rt
USE_NNAPI_CODEGEN: OFF
RANG_PATH: 3rdparty/rang/include
USE_SPIRV_KHR_INTEGER_DOT_PRODUCT: OFF
USE_OPENMP: OFF
USE_BNNS: OFF
USE_FLASHINFER:
USE_CUBLAS: ON
USE_METAL: OFF
USE_MICRO_STANDALONE_RUNTIME: OFF
USE_HEXAGON_EXTERNAL_LIBS: OFF
USE_ALTERNATIVE_LINKER: AUTO
USE_BYODT_POSIT: OFF
USE_NVSHMEM: OFF
USE_HEXAGON_RPC: OFF
USE_MICRO: OFF
DMLC_PATH: 3rdparty/dmlc-core/include
INDEX_DEFAULT_I64: ON
USE_RELAY_DEBUG: OFF
USE_RPC: OFF
USE_TENSORFLOW_PATH: none
TVM_CLML_VERSION:
USE_MIOPEN: OFF
USE_ROCM: OFF
USE_PAPI: OFF
USE_CURAND: ON
TVM_CXX_COMPILER_PATH: /usr/bin/c++
HIDE_PRIVATE_SYMBOLS: ON
```
### Steps to reproduce
1. Call `tvm.get_global_func('vm.builtin.paged_attention_kv_cache_popn')`
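As a sketch of a possible workaround (not verified against any specific TVM release): the commented-out name in the traceback, `vm.builtin.kv_state_popn`, suggests the builtin may have been renamed between TVM versions, so a caller could probe both names using the `allow_missing` parameter that is visible in the traceback. The `lookup_kv_popn` helper below is hypothetical; it takes the lookup function as an argument so the fallback logic is shown independently of any particular TVM install:

```python
def lookup_kv_popn(get_global_func):
    """Probe the KV-cache popn builtin under both known names.

    `get_global_func` is expected to behave like tvm.get_global_func,
    returning None (instead of raising) when called with
    allow_missing=True and the symbol is absent. Both names are taken
    from the traceback above; this wrapper is an illustrative sketch,
    not part of TVM or NanoLLM.
    """
    for name in ("vm.builtin.paged_attention_kv_cache_popn",  # TVM <= 0.15 era
                 "vm.builtin.kv_state_popn"):                 # newer builds
        fn = get_global_func(name, allow_missing=True)
        if fn is not None:
            return name, fn
    raise RuntimeError("no KV-cache popn builtin found in this TVM build")
```

In NanoLLM's `MLCModel.__init__`, the hard-coded lookup could then be replaced with `lookup_kv_popn(tvm.get_global_func)`, which would work on both older and newer TVM wheels if the rename is indeed the cause.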
### Triage
Please refer to the list of label tags
[here](https://github.com/apache/tvm/wiki/Issue-Triage-Labels) to find the
relevant tags and add them below in a bullet format (example below).
* needs-triage
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]