This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git


    from 4454f8d771 [Web]Allows setting powerPreference on webgpu (#17545)
     add 567eeed38b [Runtime][Dist] Implementation of KV cache transfer (#17557)

No new revisions were added by this update.

Summary of changes:
 3rdparty/flashinfer                                |   2 +-
 CMakeLists.txt                                     |   4 +-
 docs/how_to/tutorials/optimize_llm.py              |   1 +
 python/tvm/relax/frontend/nn/llm/kv_cache.py       |  33 +-
 src/runtime/contrib/nvshmem/init.cc                |  58 ++-
 src/runtime/contrib/nvshmem/kv_transfer.cu         | 333 ++++++++++++
 src/runtime/contrib/nvshmem/memory_allocator.cc    |   3 +-
 src/runtime/disco/nccl/nccl.cc                     |  10 +-
 src/runtime/relax_vm/kv_state.cc                   |   4 +
 src/runtime/relax_vm/kv_state.h                    |   8 +
 src/runtime/relax_vm/paged_kv_cache.cc             | 385 +++++++++++++-
 tests/python/disco/test_nvshmem.py                 |   4 +-
 .../test_runtime_builtin_kv_cache_transfer.py}     | 565 +++++----------------
 ...est_runtime_builtin_kv_cache_transfer_kernel.py | 252 +++++++++
 ...runtime_builtin_paged_attention_kv_cache_tir.py |   1 +
 15 files changed, 1198 insertions(+), 465 deletions(-)
 create mode 100644 src/runtime/contrib/nvshmem/kv_transfer.cu
 copy tests/python/relax/{test_runtime_builtin_paged_attention_kv_cache_tir.py => nvshmem/test_runtime_builtin_kv_cache_transfer.py} (57%)
 create mode 100644 tests/python/relax/nvshmem/test_runtime_builtin_kv_cache_transfer_kernel.py
