This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git
from 4454f8d771 [Web]Allows setting powerPreference on webgpu (#17545)
add 567eeed38b [Runtime][Dist] Implementation of KV cache transfer (#17557)
No new revisions were added by this update.
Summary of changes:
3rdparty/flashinfer | 2 +-
CMakeLists.txt | 4 +-
docs/how_to/tutorials/optimize_llm.py | 1 +
python/tvm/relax/frontend/nn/llm/kv_cache.py | 33 +-
src/runtime/contrib/nvshmem/init.cc | 58 ++-
src/runtime/contrib/nvshmem/kv_transfer.cu | 333 ++++++++++++
src/runtime/contrib/nvshmem/memory_allocator.cc | 3 +-
src/runtime/disco/nccl/nccl.cc | 10 +-
src/runtime/relax_vm/kv_state.cc | 4 +
src/runtime/relax_vm/kv_state.h | 8 +
src/runtime/relax_vm/paged_kv_cache.cc | 385 +++++++++++++-
tests/python/disco/test_nvshmem.py | 4 +-
.../test_runtime_builtin_kv_cache_transfer.py} | 565 +++++----------------
...est_runtime_builtin_kv_cache_transfer_kernel.py | 252 +++++++++
...runtime_builtin_paged_attention_kv_cache_tir.py | 1 +
15 files changed, 1198 insertions(+), 465 deletions(-)
create mode 100644 src/runtime/contrib/nvshmem/kv_transfer.cu
copy tests/python/relax/{test_runtime_builtin_paged_attention_kv_cache_tir.py => nvshmem/test_runtime_builtin_kv_cache_transfer.py} (57%)
create mode 100644 tests/python/relax/nvshmem/test_runtime_builtin_kv_cache_transfer_kernel.py