This is an automated email from the ASF dual-hosted git repository.
tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new ce1d635ce9d deploying docs
(apache/tvm@58d6a5a8f972002ba17f387d961c867a418d987e)
ce1d635ce9d is described below
commit ce1d635ce9dee118a963134150c78140282cbebc
Author: tvm-bot <[email protected]>
AuthorDate: Tue May 26 17:45:53 2026 +0000
deploying docs (apache/tvm@58d6a5a8f972002ba17f387d961c867a418d987e)
---
.../import_model.zip | Bin 35138 -> 35138 bytes
.../dlight_gpu_scheduling.zip | Bin 26311 -> 26311 bytes
.../11c11e53c7dace51a8be968ee169ed0d/ir_module.zip | Bin 23790 -> 23790 bytes
.../tir_transformation.zip | Bin 15923 -> 15923 bytes
.../meta_schedule.zip | Bin 24239 -> 24239 bytes
.../mix_python_and_tvm_with_pymodule.zip | Bin 39023 -> 39023 bytes
.../relax_creation.zip | Bin 22503 -> 22503 bytes
.../relax_transformation.zip | Bin 11480 -> 11480 bytes
.../optimize_llm.zip | Bin 54007 -> 54007 bytes
.../bring_your_own_codegen.zip | Bin 18423 -> 18423 bytes
.../e2e_opt_model.zip | Bin 14501 -> 14501 bytes
.../quick_start.zip | Bin 16252 -> 16252 bytes
.../export_and_load_executable.zip | Bin 31408 -> 31408 bytes
.../tir_creation.zip | Bin 24545 -> 24545 bytes
.../cross_compilation_and_rpc.zip | Bin 62975 -> 62975 bytes
.../customize_opt.zip | Bin 20544 -> 20544 bytes
.../relax/tutorials/sg_execution_times.rst.txt | 4 ++--
.../tensor_ir/tutorials/sg_execution_times.rst.txt | 8 +++----
.../tensor_ir/tutorials/tir_creation.rst.txt | 20 ++++++++--------
.../tensor_ir/tutorials/tir_transformation.rst.txt | 6 ++---
.../get_started/tutorials/ir_module.rst.txt | 8 +++----
.../get_started/tutorials/quick_start.rst.txt | 4 ++--
.../tutorials/sg_execution_times.rst.txt | 6 ++---
.../tutorials/cross_compilation_and_rpc.rst.txt | 6 ++---
.../how_to/tutorials/customize_opt.rst.txt | 4 ++--
.../how_to/tutorials/e2e_opt_model.rst.txt | 2 +-
.../how_to/tutorials/sg_execution_times.rst.txt | 14 +++++------
docs/_sources/sg_execution_times.rst.txt | 26 ++++++++++-----------
.../relax/tutorials/sg_execution_times.html | 4 ++--
.../tensor_ir/tutorials/sg_execution_times.html | 8 +++----
.../tensor_ir/tutorials/tir_creation.html | 20 ++++++++--------
.../tensor_ir/tutorials/tir_transformation.html | 6 ++---
docs/get_started/tutorials/ir_module.html | 16 ++++++-------
docs/get_started/tutorials/quick_start.html | 24 +++++++++----------
docs/get_started/tutorials/sg_execution_times.html | 6 ++---
docs/how_to/tutorials/bring_your_own_codegen.html | 4 ++--
.../tutorials/cross_compilation_and_rpc.html | 6 ++---
docs/how_to/tutorials/customize_opt.html | 8 +++----
docs/how_to/tutorials/e2e_opt_model.html | 6 ++---
.../tutorials/export_and_load_executable.html | 8 +++----
docs/how_to/tutorials/import_model.html | 4 ++--
docs/how_to/tutorials/optimize_llm.html | 10 ++++----
docs/how_to/tutorials/sg_execution_times.html | 14 +++++------
docs/objects.inv | Bin 23886 -> 23892 bytes
docs/reference/api/python/relax/op.html | 2 +-
docs/reference/api/python/relax/training.html | 2 +-
docs/reference/api/python/runtime/vm.html | 2 +-
docs/searchindex.js | 2 +-
docs/sg_execution_times.html | 26 ++++++++++-----------
49 files changed, 143 insertions(+), 143 deletions(-)
diff --git a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip
index 5307fa722d4..2e0209f902b 100644
Binary files
a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip and
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip differ
diff --git
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
index b2cd955832d..4a7ab83dc06 100644
Binary files
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
and
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
differ
diff --git a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
index d110c64a5ba..9c10de99cf5 100644
Binary files a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
and b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip differ
diff --git
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip
index 4e0e7a38de2..604bf79c0d2 100644
Binary files
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip and
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip differ
diff --git a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip
index ad5fec8450e..6726a366619 100644
Binary files
a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip and
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip differ
diff --git
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
index aa6adc62c50..6c806839696 100644
Binary files
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
and
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
differ
diff --git
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip
index 389c97e0066..0636a870aaf 100644
Binary files
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip and
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip differ
diff --git
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
index 5d896cd7b5d..011f90eb564 100644
Binary files
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip and
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
differ
diff --git a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip
index 2d1dff97b63..618432f3ff5 100644
Binary files
a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip and
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip differ
diff --git
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
index ccd5ed4be1d..930d8aaa4dc 100644
Binary files
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
and
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
differ
diff --git a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip
index d78c891baee..fd4d5941788 100644
Binary files
a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip and
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip differ
diff --git a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
index e2053577edb..039648d84f7 100644
Binary files a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
and b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip differ
diff --git
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
index 877217e72a4..2c41c6aea53 100644
Binary files
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
and
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
differ
diff --git a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip
index 7e3a8e33235..6ec2be8fb9e 100644
Binary files
a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip and
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip differ
diff --git
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
index 8a7efdcac84..c9ac8593077 100644
Binary files
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
and
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
differ
diff --git a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip
index 7a1c56f7caf..0ea6ac2b8df 100644
Binary files
a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip and
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip differ
diff --git a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
index 02c94414cc1..28636801472 100644
--- a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:00.207** total execution time for 2 files **from deep_dive/relax/tutorials**:
+**00:00.210** total execution time for 2 files **from deep_dive/relax/tutorials**:
.. container::
@@ -33,7 +33,7 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py`
(``relax_creation.py``)
- - 00:00.153
+ - 00:00.156
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py`
(``relax_transformation.py``)
- 00:00.054
diff --git
a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
index b9f3d0213f4..aa55d98193b 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:00.700** total execution time for 4 files **from deep_dive/tensor_ir/tutorials**:
+**00:00.709** total execution time for 4 files **from deep_dive/tensor_ir/tutorials**:
.. container::
@@ -33,13 +33,13 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py`
(``tir_transformation.py``)
- - 00:00.296
+ - 00:00.301
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py`
(``tir_creation.py``)
- - 00:00.278
+ - 00:00.282
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py`
(``dlight_gpu_scheduling.py``)
- - 00:00.118
+ - 00:00.119
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py`
(``meta_schedule.py``)
- 00:00.007
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
index 26891cd60f4..6e8766eb425 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
@@ -321,17 +321,17 @@ Now let's check the runtime dynamic shape inference:
.. code-block:: none
- [[0.95137763 0.85107917 0.3689601 1.0835637 ]
- [1.0141339 0.8445324 0.90911806 1.6136544 ]
- [0.61329514 0.30123305 0.7088936 1.0339592 ]
- [0.3933009 0.32253462 0.5984916 0.91722345]]
- [[30.018133 27.544096 33.03013 ... 28.698242 32.912727 33.821335]
- [33.676186 29.869184 36.656418 ... 29.558659 33.14984 33.684513]
- [32.54471 26.390018 33.711388 ... 28.185299 30.113274 30.87361 ]
+ [[0.9666659 1.2409723 1.5023336 0.59038645]
+ [1.466954 1.6336558 2.014125 0.88805574]
+ [0.84090245 0.84245896 1.2488935 0.5728023 ]
+ [1.6342511 1.7535877 2.2908895 0.8626775 ]]
+ [[31.720383 30.115713 31.008839 ... 35.727104 27.078178 32.137962]
+ [31.90414 31.903435 33.14347 ... 34.99326 31.078535 32.050144]
+ [32.191025 29.924765 31.966501 ... 32.56062 28.905807 31.945398]
...
- [33.102913 26.474215 33.315556 ... 28.016031 29.267859 31.99489 ]
- [30.29241 29.473051 34.34725 ... 28.928728 32.059017 32.707542]
- [34.434856 28.675213 36.80958 ... 29.167341 34.82008 33.18891 ]]
+ [32.578815 31.375263 32.65559 ... 34.989735 29.628883 30.835232]
+ [35.372784 34.941742 35.593662 ... 38.25491 29.94823 33.333088]
+ [31.666214 30.27533 31.315958 ... 35.74374 29.301 30.93808 ]]
diff --git
a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
index 06f7c0c9b97..825a207a158 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
@@ -120,7 +120,7 @@ original implementation.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 2.5840 2.5840 2.5840 2.5840 0.0000
+ 2.5827 2.5827 2.5827 2.5827 0.0000
@@ -294,7 +294,7 @@ action involves reordering these two loops.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.8664 0.8664 0.8664 0.8664 0.0000
+ 0.8629 0.8629 0.8629 0.8629 0.0000
@@ -424,7 +424,7 @@ from the reduction update via the **decompose_reduction** primitive.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.3553 0.3553 0.3553 0.3553 0.0000
+ 0.3554 0.3554 0.3554 0.3554 0.0000
diff --git a/docs/_sources/get_started/tutorials/ir_module.rst.txt
b/docs/_sources/get_started/tutorials/ir_module.rst.txt
index 4d9472bf1e4..aeec624a78c 100644
--- a/docs/_sources/get_started/tutorials/ir_module.rst.txt
+++ b/docs/_sources/get_started/tutorials/ir_module.rst.txt
@@ -696,8 +696,8 @@ We can deploy the IRModule on CPU by specifying the target as ``llvm``.
.. code-block:: none
- [[ 0.02594689 0.12035388 -0.09061812 -0.01100541 -0.09562254 0.1914179
- -0.18710697 0.08637098 -0.08076213 -0.08568094]]
+ [[-0.01371118 -0.16339105 0.09833568 -0.17663041 0.01454209 0.11325479
+ -0.12773927 -0.15076809 0.13691261 0.15500405]]
@@ -763,8 +763,8 @@ Now we can compile the IRModule on GPU, the similar way as we did on CPU.
.. code-block:: none
- [[ 0.02594695 0.12035395 -0.09061813 -0.01100544 -0.09562249 0.1914179
- -0.18710697 0.08637096 -0.08076215 -0.08568095]]
+ [[-0.01371119 -0.16339102 0.0983357 -0.17663044 0.01454208 0.11325476
+ -0.12773928 -0.15076812 0.13691261 0.15500408]]
diff --git a/docs/_sources/get_started/tutorials/quick_start.rst.txt
b/docs/_sources/get_started/tutorials/quick_start.rst.txt
index 2d80f361e41..5fbee4f0ffe 100644
--- a/docs/_sources/get_started/tutorials/quick_start.rst.txt
+++ b/docs/_sources/get_started/tutorials/quick_start.rst.txt
@@ -224,8 +224,8 @@ different devices.
.. code-block:: none
- [[25488.164 24824.139 24354.932 25591.982 24814.775 26580.848 23389.242
- 25353.81 24581.1 23791.096]]
+ [[24241.857 24889.234 25745.719 25249.332 25855.055 25478.053 26986.88
+ 25268.55 24716.8 24717.395]]
diff --git a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
index ffd8fd3e76a..f4af3bd3d2c 100644
--- a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:06.237** total execution time for 2 files **from get_started/tutorials**:
+**00:07.509** total execution time for 2 files **from get_started/tutorials**:
.. container::
@@ -33,8 +33,8 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_get_started_tutorials_ir_module.py` (``ir_module.py``)
- - 00:06.082
+ - 00:07.352
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_quick_start.py`
(``quick_start.py``)
- - 00:00.156
+ - 00:00.157
- 0.0
diff --git a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
index 30811b08104..8e573ce8862 100644
--- a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
@@ -267,7 +267,7 @@ device and returns the measured cost. Network overhead is excluded.
.. code-block:: none
- 9.7e-08 secs/op
+ 9.3e-08 secs/op
@@ -857,8 +857,8 @@ This workflow is applicable to various deployment scenarios:
Converted PyTorch model to Relax:
- Number of parameters: 4
Using local target for demonstration
- Exported library to: /tmp/tmpj7lc8xaw/model_deployed.so
- Saved parameters to: /tmp/tmpj7lc8xaw/model_params.npz
+ Exported library to: /tmp/tmpiw5zwg5w/model_deployed.so
+ Saved parameters to: /tmp/tmpiw5zwg5w/model_params.npz
RPC workflow (works for any remote device):
==================================================
diff --git a/docs/_sources/how_to/tutorials/customize_opt.rst.txt
b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
index 015585e6c1d..0f9b4040dd2 100644
--- a/docs/_sources/how_to/tutorials/customize_opt.rst.txt
+++ b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
@@ -426,8 +426,8 @@ We can build and deploy the optimized model to the TVM runtime.
.. code-block:: none
- [[25360.273 25977.055 26176.404 26010.953 23686.574 24671.594 25639.94
- 26352.355 26981.152 24812.078]]
+ [[24617.4 24113.621 23790.748 25276.402 22978.486 24189.844 24161.701
+ 25012.629 24334.504 21885.848]]
diff --git a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
index 2fc22948433..f64db2e0767 100644
--- a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
+++ b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
@@ -54,7 +54,7 @@ PyTorch.
.. code-block:: none
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
- 0%| | 0.00/44.7M [00:00<?, ?B/s] 66%|██████▌ | 29.4M/44.7M [00:00<00:00, 308MB/s] 100%|██████████| 44.7M/44.7M [00:00<00:00, 336MB/s]
+ 0%| | 0.00/44.7M [00:00<?, ?B/s] 67%|██████▋ | 29.8M/44.7M [00:00<00:00, 311MB/s] 100%|██████████| 44.7M/44.7M [00:00<00:00, 338MB/s]
diff --git a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
index 4ffc2306132..c7b6c4685c9 100644
--- a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:15.518** total execution time for 8 files **from how_to/tutorials**:
+**00:18.827** total execution time for 8 files **from how_to/tutorials**:
.. container::
@@ -33,22 +33,22 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` (``optimize_llm.py``)
- - 00:10.311
+ - 00:10.447
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_import_model.py` (``import_model.py``)
- - 00:03.353
+ - 00:06.478
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` (``customize_opt.py``)
- - 00:00.745
+ - 00:00.712
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``e2e_opt_model.py``)
- - 00:00.584
+ - 00:00.665
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py`
(``cross_compilation_and_rpc.py``)
- - 00:00.479
+ - 00:00.476
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py`
(``bring_your_own_codegen.py``)
- - 00:00.040
+ - 00:00.042
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_mix_python_and_tvm_with_pymodule.py`
(``mix_python_and_tvm_with_pymodule.py``)
- 00:00.004
diff --git a/docs/_sources/sg_execution_times.rst.txt
b/docs/_sources/sg_execution_times.rst.txt
index 822b487617a..3fcce8bfba7 100644
--- a/docs/_sources/sg_execution_times.rst.txt
+++ b/docs/_sources/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:22.663** total execution time for 16 files **from all galleries**:
+**00:27.255** total execution time for 16 files **from all galleries**:
.. container::
@@ -33,43 +33,43 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py`
(``../how_to/tutorials/optimize_llm.py``)
- - 00:10.311
+ - 00:10.447
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_ir_module.py`
(``../get_started/tutorials/ir_module.py``)
- - 00:06.082
+ - 00:07.352
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_import_model.py`
(``../how_to/tutorials/import_model.py``)
- - 00:03.353
+ - 00:06.478
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_customize_opt.py`
(``../how_to/tutorials/customize_opt.py``)
- - 00:00.745
+ - 00:00.712
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py`
(``../how_to/tutorials/e2e_opt_model.py``)
- - 00:00.584
+ - 00:00.665
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py`
(``../how_to/tutorials/cross_compilation_and_rpc.py``)
- - 00:00.479
+ - 00:00.476
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py`
(``../deep_dive/tensor_ir/tutorials/tir_transformation.py``)
- - 00:00.296
+ - 00:00.301
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py`
(``../deep_dive/tensor_ir/tutorials/tir_creation.py``)
- - 00:00.278
+ - 00:00.282
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_quick_start.py`
(``../get_started/tutorials/quick_start.py``)
- - 00:00.156
+ - 00:00.157
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py`
(``../deep_dive/relax/tutorials/relax_creation.py``)
- - 00:00.153
+ - 00:00.156
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py`
(``../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py``)
- - 00:00.118
+ - 00:00.119
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py`
(``../deep_dive/relax/tutorials/relax_transformation.py``)
- 00:00.054
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py`
(``../how_to/tutorials/bring_your_own_codegen.py``)
- - 00:00.040
+ - 00:00.042
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py`
(``../deep_dive/tensor_ir/tutorials/meta_schedule.py``)
- 00:00.007
diff --git a/docs/deep_dive/relax/tutorials/sg_execution_times.html
b/docs/deep_dive/relax/tutorials/sg_execution_times.html
index de29545288a..a1e0b6934b6 100644
--- a/docs/deep_dive/relax/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/relax/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
<section id="computation-times">
<span
id="sphx-glr-deep-dive-relax-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:00.207</strong> total execution time for 2 files <strong>from
deep_dive/relax/tutorials</strong>:</p>
+<p><strong>00:00.210</strong> total execution time for 2 files <strong>from
deep_dive/relax/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -319,7 +319,7 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
class="std std-ref">Relax Creation</span></a> (<code class="docutils literal
notranslate"><span class="pre">relax_creation.py</span></code>)</p></td>
-<td><p>00:00.153</p></td>
+<td><p>00:00.156</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span class="pre">relax_transformation.py</span></code>)</p></td>
diff --git a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
index 413511190be..6898e9b5f3c 100644
--- a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
<section id="computation-times">
<span
id="sphx-glr-deep-dive-tensor-ir-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:00.700</strong> total execution time for 4 files <strong>from
deep_dive/tensor_ir/tutorials</strong>:</p>
+<p><strong>00:00.709</strong> total execution time for 4 files <strong>from
deep_dive/tensor_ir/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -319,15 +319,15 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span class="pre">tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.296</p></td>
+<td><p>00:00.301</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
class="std std-ref">TensorIR Creation</span></a> (<code class="docutils
literal notranslate"><span class="pre">tir_creation.py</span></code>)</p></td>
-<td><p>00:00.278</p></td>
+<td><p>00:00.282</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code
class="docutils literal notranslate"><span
class="pre">dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.118</p></td>
+<td><p>00:00.119</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code
class="docutils literal notranslate"><span
class="pre">meta_schedule.py</span></code>)</p></td>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
index f37b089452f..98c431db7fc 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
@@ -495,17 +495,17 @@ be used to ascertain the shape and data type of a
TensorIR.</p>
<span class="nb">print</span><span class="p">(</span><span
class="n">evaluate_dynamic_shape</span><span class="p">(</span><span
class="n">dyn_shape_lib</span><span class="p">,</span> <span
class="n">m</span><span class="o">=</span><span class="mi">64</span><span
class="p">,</span> <span class="n">n</span><span class="o">=</span><span
class="mi">64</span><span class="p">,</span> <a
href="../../../reference/api/python/tirx/tirx.html#tvm.tirx.IterVar"
title="tvm.tirx.IterVar" class="sphx-gl [...]
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[0.95137763 0.85107917 0.3689601
1.0835637 ]
- [1.0141339 0.8445324 0.90911806 1.6136544 ]
- [0.61329514 0.30123305 0.7088936 1.0339592 ]
- [0.3933009 0.32253462 0.5984916 0.91722345]]
-[[30.018133 27.544096 33.03013 ... 28.698242 32.912727 33.821335]
- [33.676186 29.869184 36.656418 ... 29.558659 33.14984 33.684513]
- [32.54471 26.390018 33.711388 ... 28.185299 30.113274 30.87361 ]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[0.9666659 1.2409723 1.5023336
0.59038645]
+ [1.466954 1.6336558 2.014125 0.88805574]
+ [0.84090245 0.84245896 1.2488935 0.5728023 ]
+ [1.6342511 1.7535877 2.2908895 0.8626775 ]]
+[[31.720383 30.115713 31.008839 ... 35.727104 27.078178 32.137962]
+ [31.90414 31.903435 33.14347 ... 34.99326 31.078535 32.050144]
+ [32.191025 29.924765 31.966501 ... 32.56062 28.905807 31.945398]
...
- [33.102913 26.474215 33.315556 ... 28.016031 29.267859 31.99489 ]
- [30.29241 29.473051 34.34725 ... 28.928728 32.059017 32.707542]
- [34.434856 28.675213 36.80958 ... 29.167341 34.82008 33.18891 ]]
+ [32.578815 31.375263 32.65559 ... 34.989735 29.628883 30.835232]
+ [35.372784 34.941742 35.593662 ... 38.25491 29.94823 33.333088]
+ [31.666214 30.27533 31.315958 ... 35.74374 29.301 30.93808 ]]
</pre></div>
</div>
</section>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
index 6e1597fd45c..8d15f85bb20 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
@@ -374,7 +374,7 @@ original implementation.</p>
</div>
<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 2.5840 2.5840 2.5840 2.5840 0.0000
+ 2.5827 2.5827 2.5827 2.5827 0.0000
</pre></div>
</div>
<section id="initialization-schedule">
@@ -472,7 +472,7 @@ class Module:
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.8664 0.8664 0.8664 0.8664 0.0000
+ 0.8629 0.8629 0.8629 0.8629 0.0000
</pre></div>
</div>
</section>
@@ -568,7 +568,7 @@ class Module:
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.3553 0.3553 0.3553 0.3553 0.0000
+ 0.3554 0.3554 0.3554 0.3554 0.0000
</pre></div>
</div>
</section>
diff --git a/docs/get_started/tutorials/ir_module.html
b/docs/get_started/tutorials/ir_module.html
index 23c17c8f0a6..2890ba1eb9b 100644
--- a/docs/get_started/tutorials/ir_module.html
+++ b/docs/get_started/tutorials/ir_module.html
@@ -811,16 +811,16 @@ backends.</p>
<p>We can deploy the IRModule on CPU by specifying the target as <code
class="docutils literal notranslate"><span class="pre">llvm</span></code>.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">exec</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">()</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="n">raw_data</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span
class="p">)</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"float32"</span><span
class="p">)</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">raw_data</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
-<span class="n">cpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"main"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https:// [...]
+<span class="n">cpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#dict"
title="builtins.dict" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params_from_torch</span></a><span class="p">[</ [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">cpu_out</span><span class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[ 0.02594689 0.12035388 -0.09061812
-0.01100541 -0.09562254 0.1914179
- -0.18710697 0.08637098 -0.08076213 -0.08568094]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.01371118 -0.16339105 0.09833568
-0.17663041 0.01454209 0.11325479
+ -0.12773927 -0.15076809 0.13691261 0.15500405]]
</pre></div>
</div>
</section>
@@ -843,19 +843,19 @@ the details of <code class="docutils literal
notranslate"><span class="pre">DLig
<p>Now we can compile the IRModule on GPU, the similar way as we did on
CPU.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">exec</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">raw_data</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span
class="p">,</span> <span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"main"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https:// [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a><span class="p">)</span><s [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">gpu_out</span><span class="p">)</span>
<span class="c1"># Check the correctness of the results</span>
<span class="k">assert</span> <span class="n">np</span><span
class="o">.</span><span class="n">allclose</span><span class="p">(</span><span
class="n">cpu_out</span><span class="p">,</span> <span
class="n">gpu_out</span><span class="p">,</span> <span
class="n">atol</span><span class="o">=</span><span class="mf">1e-3</span><span
class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[ 0.02594695 0.12035395 -0.09061813
-0.01100544 -0.09562249 0.1914179
- -0.18710697 0.08637096 -0.08076215 -0.08568095]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.01371119 -0.16339102 0.0983357
-0.17663044 0.01454208 0.11325476
+ -0.12773928 -0.15076812 0.13691261 0.15500408]]
</pre></div>
</div>
</section>
diff --git a/docs/get_started/tutorials/quick_start.html
b/docs/get_started/tutorials/quick_start.html
index 55fd299666a..4b16dcf0ad2 100644
--- a/docs/get_started/tutorials/quick_start.html
+++ b/docs/get_started/tutorials/quick_start.html
@@ -452,16 +452,16 @@ different devices.</p>
<a href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">target</span></a> <span class="o">=</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span
class="o">.</span><span class="n">target< [...]
<a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span class="n">compile</span [...]
<span class="n">device</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">()</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="n">data</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span
class="p">)</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"float32"</span><span
class="p">)</span>
<span class="n">tvm_data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">data</span><span class="p">,</span> <span
class="n">device</span><span class="o">=</span><span
class="n">device</span><span class="p">)</span>
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">np</span><span class="o">.</span><span
class="n">random</span><span class="o">.</span><span class="n">rand</span><span
class="p">(</span><span class="o">*</span><span class="n">param</span><span
class="o">.</sp [...]
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span
class="n">param</span><span class="p">,</span> <span class="n"> [...]
-<span class="nb">print</span><span class="p">(</span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"forward"</span><span
class="p">](</span><span class="n">tvm_data</span><span class="p">,</span>
<span class="o">*</span><a href="http [...]
+<span class="nb">print</span><span class="p">(</span><span
class="n">vm</span><span class="p">[</span><span
class="s2">"forward"</span><span class="p">](</span><span
class="n">tvm_data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a><span class="p">)</span><s [...]
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[25488.164 24824.139 24354.932 25591.982
24814.775 26580.848 23389.242
- 25353.81 24581.1 23791.096]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[24241.857 24889.234 25745.719 25249.332
25855.055 25478.053 26986.88
+ 25268.55 24716.8 24717.395]]
</pre></div>
</div>
<p>Our goal is to bring machine learning to the application with any language
of interest,
@@ -469,8 +469,8 @@ with the minimum runtime support.</p>
<ul>
<li><p>Each function in IRModule becomes a runnable function in the runtime.
For example in LLM
cases, we can call <code class="docutils literal notranslate"><span
class="pre">prefill</span></code> and <code class="docutils literal
notranslate"><span class="pre">decode</span></code> functions directly.</p>
-<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"prefill"</span><span
class="p">](</span> [...]
-<span class="n">decoded_logits</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"decode"</span><span
class="p">](</span><span class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span>< [...]
+<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span
class="o">=</span> <span class="n">vm</span><span class="p">[</span><span
class="s2">"prefill"</span><span class="p">](</span><span
class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">)</span>
+<span class="n">decoded_logits</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"decode"</span><span class="p">](</span><span
class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">)</span>
</pre></div>
</div>
</li>
@@ -485,15 +485,15 @@ copy exchange with existing ecosystem (DLPack exchange
with PyTorch)</p>
</li>
<li><p>TVM runtime works in non-python environments, so it works on settings
such as mobile</p>
<div class="highlight-C++ notranslate"><div
class="highlight"><pre><span></span><span class="c1">// C++ snippet</span>
-<span class="n">runtime</span><span class="o">::</span><span
class="n">Module</span><span class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w">
</span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.r [...]
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="n">GetFunction</span><span
class="p">(</span><span class="s">"init"</span><span
class="p">)(...);</span>
-<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="n">GetFunction</span><span
class="p">(</span><span [...]
+<span class="n">runtime</span><span class="o">::</span><span
class="n">Module</span><span class="w"> </span><span class="n">vm</span><span
class="w"> </span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a><span class="p">.</span><span class="n">GetFunction [...]
+<span class="n">vm</span><span class="p">.</span><span
class="n">GetFunction</span><span class="p">(</span><span
class="s">"init"</span><span class="p">)(...);</span>
+<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><span class="n">vm</span><span class="p">.</span><span
class="n">GetFunction</span><span class="p">(</span><span
class="s">"prefill"</span><span class="p">)(</span><span
class="n">data</span><span class="p">,</span><span class="w"> </span><span
class="n">weight</span><span class="p">,</span><span class="w"> </span><span
class="n" [...]
</pre></div>
</div>
<div class="highlight-Java notranslate"><div
class="highlight"><pre><span></span><span class="c1">// Java snippet</span>
-<span class="n">Module</span><span class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w">
</span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="na">getFunction</span><span
class="p">(</span><span class="s">"init"</span><span
class="p">).</span><span class="na">pushArg</span><span
class="p">(...).</span><span class="na">invoke</span>< [...]
-<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="na">getFunction</span><span
class="p">(</span><spa [...]
+<span class="n">Module</span><span class="w"> </span><span
class="n">vm</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span class="s">"l
[...]
+<span class="n">vm</span><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span
class="s">"init"</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(...).</span><span
class="na">invoke</span><span class="p">;</span>
+<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><span class="n">vm</span><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span
class="s">"prefill"</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(</span><span
class="n">data</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(</span><spa [...]
</pre></div>
</div>
</li>
diff --git a/docs/get_started/tutorials/sg_execution_times.html
b/docs/get_started/tutorials/sg_execution_times.html
index 9c71ef5ba2c..d914b09fa66 100644
--- a/docs/get_started/tutorials/sg_execution_times.html
+++ b/docs/get_started/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
<section id="computation-times">
<span
id="sphx-glr-get-started-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:06.237</strong> total execution time for 2 files <strong>from
get_started/tutorials</strong>:</p>
+<p><strong>00:07.509</strong> total execution time for 2 files <strong>from
get_started/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -319,11 +319,11 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
class="std std-ref">IRModule</span></a> (<code class="docutils literal
notranslate"><span class="pre">ir_module.py</span></code>)</p></td>
-<td><p>00:06.082</p></td>
+<td><p>00:07.352</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
class="std std-ref">Quick Start</span></a> (<code class="docutils literal
notranslate"><span class="pre">quick_start.py</span></code>)</p></td>
-<td><p>00:00.156</p></td>
+<td><p>00:00.157</p></td>
<td><p>0.0</p></td>
</tr>
</tbody>
diff --git a/docs/how_to/tutorials/bring_your_own_codegen.html
b/docs/how_to/tutorials/bring_your_own_codegen.html
index 91976b99c85..79d36c8340d 100644
--- a/docs/how_to/tutorials/bring_your_own_codegen.html
+++ b/docs/how_to/tutorials/bring_your_own_codegen.html
@@ -469,7 +469,7 @@ and <code class="docutils literal notranslate"><span
class="pre">USE_EXAMPLE_NPU
<span class="k">with</span> <a
href="../../reference/api/python/transform.html#tvm.ir.transform.PassContext"
title="tvm.ir.transform.PassContext"
class="sphx-glr-backref-module-tvm-ir-transform
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span
class="o">.</span><span class="n">transform</span><span class="o">.</span><span
class="n">PassContext</span></a><span class="p">(</span><span
class="n">opt_level</span><span class="o">=</span><span
class="mi">3</span><span class=" [...]
<span class="n">built</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.build"
title="tvm.relax.build" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-function"><span class="n">relax</span><span
class="o">.</span><span class="n">build</span></a><span class="p">(</span><span
class="n">mod</span><span class="p">,</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="s [...]
- <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">built</span><span class="p">,</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cp [...]
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">built</span><span class="p">,</span> <span class="n">tvm</span><span
class="o">.</span><span [...]
<span class="n">result</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">x_np</span><span class="p">,</span> <span class="n">tvm</span><span
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span
[...]
<span class="k">assert</span> <span class="n">result</span><span
class="o">.</span><span class="n">numpy</span><span class="p">()</span><span
class="o">.</span><span class="n">shape</span> <span class="o">==</span> <span
class="p">(</span><span class="mi">2</span><span class="p">,</span> <span
class="mi">8</span><span class="p">)</span>
@@ -509,7 +509,7 @@ priority), both ops are offloaded as a single composite
function.</p>
<span class="n">x2_np</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">randn</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span
class="mi">32</span><span class="p">)</span><span class="o">.</span><span
class="n">astype</span><span class="p">(</span><s [...]
<span class="n">w2_np</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">randn</span><span class="p">(</span><span
class="mi">16</span><span class="p">,</span> <span class="mi">3</span><span
class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span
class="mi">3</span><span class="p">)</span><span class="o">.</span><span
class="n">astype</span><span class="p">(</span><sp [...]
- <span class="n">vm2</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">built2</span><span class="p">,</span> <span
class="n">tvm</span><span class="o">.</span><span class="n"> [...]
+ <span class="n">vm2</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">built2</span><span class="p">,</span> <span class="n">tvm</span><span
class="o">.</span><spa [...]
<span class="n">result2</span> <span class="o">=</span> <span
class="n">vm2</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span>
<span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span
class="n">x2_np</span><span class="p">,</span> <span class="n">tvm</span><span
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class=" [...]
<span class="p">)</span>
diff --git a/docs/how_to/tutorials/cross_compilation_and_rpc.html
b/docs/how_to/tutorials/cross_compilation_and_rpc.html
index ec6e989356d..eb5d8fc8dcf 100644
--- a/docs/how_to/tutorials/cross_compilation_and_rpc.html
+++ b/docs/how_to/tutorials/cross_compilation_and_rpc.html
@@ -474,7 +474,7 @@ device and returns the measured cost. Network overhead is
excluded.</p>
<span class="nb">print</span><span class="p">(</span><span
class="sa">f</span><span class="s2">"</span><span class="si">{</span><span
class="n">cost</span><span class="si">:</span><span class="s2">g</span><span
class="si">}</span><span class="s2"> secs/op"</span><span
class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>9.7e-08 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>9.3e-08 secs/op
</pre></div>
</div>
<dl>
@@ -990,8 +990,8 @@ for ONNX models. Simply replace <code class="docutils
literal notranslate"><span
Converted PyTorch model to Relax:
- Number of parameters: 4
Using local target for demonstration
-Exported library to: /tmp/tmpj7lc8xaw/model_deployed.so
-Saved parameters to: /tmp/tmpj7lc8xaw/model_params.npz
+Exported library to: /tmp/tmpiw5zwg5w/model_deployed.so
+Saved parameters to: /tmp/tmpiw5zwg5w/model_params.npz
RPC workflow (works for any remote device):
==================================================
diff --git a/docs/how_to/tutorials/customize_opt.html
b/docs/how_to/tutorials/customize_opt.html
index 3f83ba34d22..fa3cf221600 100644
--- a/docs/how_to/tutorials/customize_opt.html
+++ b/docs/how_to/tutorials/customize_opt.html
@@ -613,16 +613,16 @@ pushing the performance to the limit. The current
optimization may not be the be
<p>We can build and deploy the optimized model to the TVM runtime.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-functi [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-ba [...]
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">np</span><span
class="o">.</span><span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"forward"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"forward"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a><span class="p">)</span [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">gpu_out</span><span class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[25360.273 25977.055 26176.404 26010.953
23686.574 24671.594 25639.94
- 26352.355 26981.152 24812.078]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[24617.4 24113.621 23790.748 25276.402
22978.486 24189.844 24161.701
+ 25012.629 24334.504 21885.848]]
</pre></div>
</div>
</section>
diff --git a/docs/how_to/tutorials/e2e_opt_model.html
b/docs/how_to/tutorials/e2e_opt_model.html
index 3118b49a968..239330ad45e 100644
--- a/docs/how_to/tutorials/e2e_opt_model.html
+++ b/docs/how_to/tutorials/e2e_opt_model.html
@@ -332,8 +332,8 @@ PyTorch.</p>
<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>Downloading:
"https://download.pytorch.org/models/resnet18-f37072fd.pth" to
/workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
0%| | 0.00/44.7M [00:00<?, ?B/s]
- 66%|██████▌ | 29.4M/44.7M [00:00<00:00, 308MB/s]
-100%|██████████| 44.7M/44.7M [00:00<00:00, 336MB/s]
+ 67%|██████▋ | 29.8M/44.7M [00:00<00:00, 311MB/s]
+100%|██████████| 44.7M/44.7M [00:00<00:00, 338MB/s]
</pre></div>
</div>
</section>
@@ -434,7 +434,7 @@ We skip this step in the CI environment.</p>
<span class="n">mod</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">s_tir</span><span
class="o">.</span><span class="n">transform</span><span class="o">.</span><span
class="n">DefaultGPUSchedule</span><span class="p">()(</span><span
class="n">mod</span><span class="p">)</span>
<span class="n">ex</span> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span
class="n">compile</span></a><span class="p">(</span><span
class="n">mod</span><span class="p">,</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
- <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">ex</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">ex</span><span class="p">,</span> <span class="n">dev</span><span
class="p">)</span>
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">gpu_data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span
class="p">,</span> <span class="mi">224< [...]
<span class="n">gpu_params</span> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span
class="p">,</span> <span class="n">dev</span><span class="p">)</span> <span
class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span
class="n">params</span><span class="p">[</span><span class="s2" [...]
diff --git a/docs/how_to/tutorials/export_and_load_executable.html
b/docs/how_to/tutorials/export_and_load_executable.html
index 9f6dd6da4f7..7374773bb04 100644
--- a/docs/how_to/tutorials/export_and_load_executable.html
+++ b/docs/how_to/tutorials/export_and_load_executable.html
@@ -446,7 +446,7 @@ runtime module directly.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="k">if</span> <a
href="https://docs.python.org/3/library/functions.html#bool"
title="builtins.bool" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">RUN_EXAMPLE</span></a><span class="p">:</span>
<span class="n">loaded_rt_mod</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">load_module</span><span
class="p">(</span><span class="nb">str</span><span class="p">(</span><span
class="n">library_path</span><span class="p">))</span>
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">(</span><span class="mi">0</span><span class="p">)</span>
- <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">loaded_rt_mod</span><span class="p">,</span>
<span class="n">dev</span><span class="p">)</span>
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">loaded_rt_mod</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
<span class="c1"># Prepare input data</span>
<span class="n">input_tensor</span> <span class="o">=</span> <span
class="n">torch</span><span class="o">.</span><span class="n">randn</span><span
class="p">(</span><span class="mi">1</span><span class="p">,</span> <span
class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span
class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span
class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span
class="o">.</span><span class="n">f [...]
@@ -527,7 +527,7 @@ of how to reload and run the model. Save this as <code
class="docutils literal n
<span class="c1"># Step 2: Create Virtual Machine</span>
<span class="n">device</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span
class="n">device</span><span class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span
class="p">)</span>
<span class="c1"># Step 3: Load parameters from the .npz file</span>
<span class="n">params_npz</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">load</span><span
class="p">(</span><span
class="s2">"relax_export_artifacts/model_params.npz"</span><span
class="p">)</span>
@@ -562,7 +562,7 @@ To run on GPU instead of CPU, make the following
changes:</p>
</li>
<li><p><strong>Use GPU device in the script</strong>:</p>
<div class="highlight-python notranslate"><div
class="highlight"><pre><span></span><span class="n">device</span> <span
class="o">=</span> <span class="n">tvm</span><span class="o">.</span><span
class="n">cuda</span><span class="p">(</span><span class="mi">0</span><span
class="p">)</span> <span class="c1"># Use CUDA device instead of CPU</span>
-<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span
class="n">device</span><span class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span
class="p">)</span>
<span class="c1"># Load parameters to GPU</span>
<span class="n">params</span> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span
class="n">params_npz</span><span class="p">[</span><span
class="sa">f</span><span class="s2">"p_</span><span
class="si">{</span><span class="n">i</span><span class="si">}</span><span
class="s2">"</span><span class="p">],</span> <span class= [...]
@@ -625,7 +625,7 @@ for a comprehensive guide on:</p>
<span class="c1"># Step 4: Load and run on remote device</span>
<span class="n">lib</span> <span class="o">=</span> <span
class="n">remote</span><span class="o">.</span><span
class="n">load_module</span><span class="p">(</span><span
class="s2">"mlp_arm.so"</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span
class="n">remote</span><span class="o">.</span><span class="n">cpu</ [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">lib</span><span class="p">,</span> <span class="n">remote</span><span
class="o">.</span><span cla [...]
<span class="c1"># ... prepare input and params, then run inference</span>
</pre></div>
</div>
diff --git a/docs/how_to/tutorials/import_model.html
b/docs/how_to/tutorials/import_model.html
index 3d89cb7fb29..bf683b39989 100644
--- a/docs/how_to/tutorials/import_model.html
+++ b/docs/how_to/tutorials/import_model.html
@@ -530,13 +530,13 @@ shown below.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="n">mod_compiled</span> <span
class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.get_pipeline"
title="tvm.relax.get_pipeline" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-function"><span class="n">relax</span><span
class="o">.</span><span class="n">get_pipeline</span></a><span
class="p">(</span><span class="s2">"zero"</span><span cla [...]
<a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">exec_module</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span class="n">comp [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">()</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="c1"># Run inference</span>
<span class="n">input_data</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span
class="mi">32</span><span class="p">)</span><span class="o">.</span><span
class="n">astype</span><span class="p">(</span><s [...]
<span class="n">tvm_input</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">input_data</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">tvm_params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span
class="p">,</span> <span class="n"> [...]
-<span class="n">tvm_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"main"</span><span
class="p">](</span><span class="n">tvm_input</span><span class="p">,</span>
<span class="o">*</span><a href="htt [...]
+<span class="n">tvm_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">tvm_input</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">tvm_params</span></a><span class="p">)</sp [...]
<span class="c1"># Compare with PyTorch</span>
<span class="k">with</span> <span class="n">torch</span><span
class="o">.</span><span class="n">no_grad</span><span class="p">():</span>
diff --git a/docs/how_to/tutorials/optimize_llm.html
b/docs/how_to/tutorials/optimize_llm.html
index 28404dde3d2..7cec203820f 100644
--- a/docs/how_to/tutorials/optimize_llm.html
+++ b/docs/how_to/tutorials/optimize_llm.html
@@ -728,7 +728,7 @@ is designed specifically for the LLMs.</p>
<span class="k">with</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">target</span></a><span class="p">:</span>
<a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span class="n">compile</ [...]
- <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span cla [...]
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" c [...]
</pre></div>
</div>
</section>
@@ -826,7 +826,7 @@ the model documentation for the correct tokenization and
prompt format.</p>
key and value tensors for the attention layer. Apache TVM provides a
PagedKVCache to store the
key and value tensors. We create the PagedKVCache with the specified
parameters.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="k">if</span> <span
class="ow">not</span> <a
href="https://docs.python.org/3/library/functions.html#bool"
title="builtins.bool" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">IS_IN_CI</span></a><span class="p">:</span>
- <span class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span
class="s2">"create_tir_paged_kv_cache"</span><span class="p">](</span>
+ <span class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"create_tir_paged_kv_cache"</span><span class="p">](</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="mi">1</span><span class="p">]),</span> <span
class="c1"># max_batch_size=1</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>
<span class="c1"># max_total_seq_len=2048</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>
<span class="c1"># prefill_chunk_size=2048</span>
@@ -843,7 +843,7 @@ compiled in the Relax IRModule to embed the tokens into the
hidden states.</p>
<span class="k">def</span><span class="w"> </span><span
class="nf">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">):</span>
- <span class="n">_embed</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"embed"</span><span
class="p">](</span><span class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span [...]
+ <span class="n">_embed</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"embed"</span><span class="p">](</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="c1"># Reshape hidden from [seq_len, hidden_size] to [1,
seq_len, hidden_size]</span>
<span class="n">_embed</span> <span class="o">=</span> <span
class="n">nd_view_func</span><span class="p">(</span><span
class="n">_embed</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="mi">1</span><span class="p">,</span> <span
class="n">_embed</span><span class="o">.</span>< [...]
<span class="k">return</span> <span class="n">_embed</span>
@@ -866,7 +866,7 @@ and <cite>end_forward_func</cite> to end the forward
pass.</p>
<span class="n">add_sequence_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">seq_id</span><span class="p">)</span>
<span class="n">hidden_states</span> <span class="o">=</span> <span
class="n">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">begin_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="s [...]
- <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"prefill"</span><span
class="p">](</span><span class="n">hidden_states</ [...]
+ <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"prefill"</span><span class="p">](</span><span
class="n">hidden_states</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">end_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">)</span>
</pre></div>
</div>
@@ -898,7 +898,7 @@ IRModule to generate the token.</p>
<span class="n">tokens</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">array</span><span
class="p">([</span><span class="n">last_token</span><span
class="p">])</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"int32"< [...]
<span class="n">hidden_states</span> <span class="o">=</span> <span
class="n">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">begin_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" clas [...]
- <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"decode"</span><span
class="p">](</span><span class="n">hidden_state [...]
+ <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"decode"</span><span class="p">](</span><span
class="n">hidden_states</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">end_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">)</span>
<span class="n">last_token</span> <span class="o">=</span> <span
class="n">sample_token</span><span class="p">(</span><span
class="n">logits</span><span class="p">)</span>
diff --git a/docs/how_to/tutorials/sg_execution_times.html
b/docs/how_to/tutorials/sg_execution_times.html
index 08aa0b128b3..63504becdc3 100644
--- a/docs/how_to/tutorials/sg_execution_times.html
+++ b/docs/how_to/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
<section id="computation-times">
<span id="sphx-glr-how-to-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:15.518</strong> total execution time for 8 files <strong>from
how_to/tutorials</strong>:</p>
+<p><strong>00:18.827</strong> total execution time for 8 files <strong>from
how_to/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -319,27 +319,27 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
class="std std-ref">Optimize Large Language Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.311</p></td>
+<td><p>00:10.447</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span
class="std std-ref">Importing Models from ML Frameworks</span></a> (<code
class="docutils literal notranslate"><span
class="pre">import_model.py</span></code>)</p></td>
-<td><p>00:03.353</p></td>
+<td><p>00:06.478</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span class="pre">customize_opt.py</span></code>)</p></td>
-<td><p>00:00.745</p></td>
+<td><p>00:00.712</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code class="docutils
literal notranslate"><span class="pre">e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.584</p></td>
+<td><p>00:00.665</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.479</p></td>
+<td><p>00:00.476</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a>
(<code class="docutils literal notranslate"><span
class="pre">bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.040</p></td>
+<td><p>00:00.042</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="mix_python_and_tvm_with_pymodule.html#sphx-glr-how-to-tutorials-mix-python-and-tvm-with-pymodule-py"><span
class="std std-ref">Mix Python/PyTorch with TVM Using BasePyModule</span></a>
(<code class="docutils literal notranslate"><span
class="pre">mix_python_and_tvm_with_pymodule.py</span></code>)</p></td>
diff --git a/docs/objects.inv b/docs/objects.inv
index 8cedefd68df..54ee948d55e 100644
Binary files a/docs/objects.inv and b/docs/objects.inv differ
diff --git a/docs/reference/api/python/relax/op.html
b/docs/reference/api/python/relax/op.html
index 7c0a739340f..f3e4f9782a2 100644
--- a/docs/reference/api/python/relax/op.html
+++ b/docs/reference/api/python/relax/op.html
@@ -2526,7 +2526,7 @@ Defaults to “ij”.</p></li>
<dl class="py function">
<dt class="sig sig-object py" id="tvm.relax.op.layout_transform">
-<span class="sig-prename descclassname"><span
class="pre">tvm.relax.op.</span></span><span class="sig-name descname"><span
class="pre">layout_transform</span></span><span class="sig-paren">(</span><em
class="sig-param"><span class="n"><span class="pre">x</span></span><span
class="p"><span class="pre">:</span></span><span class="w"> </span><span
class="n"><a class="reference internal" href="../ir.html#tvm.ir.RelaxExpr"
title="tvm.ir.expr.RelaxExpr"><span class="pre">RelaxExpr</span></a></ [...]
+<span class="sig-prename descclassname"><span
class="pre">tvm.relax.op.</span></span><span class="sig-name descname"><span
class="pre">layout_transform</span></span><span class="sig-paren">(</span><em
class="sig-param"><span class="n"><span class="pre">x</span></span><span
class="p"><span class="pre">:</span></span><span class="w"> </span><span
class="n"><a class="reference internal" href="../ir.html#tvm.ir.RelaxExpr"
title="tvm.ir.expr.RelaxExpr"><span class="pre">RelaxExpr</span></a></ [...]
<dd><p>Modifies the layout of a tensor.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
diff --git a/docs/reference/api/python/relax/training.html
b/docs/reference/api/python/relax/training.html
index 986b684f71a..a170f682326 100644
--- a/docs/reference/api/python/relax/training.html
+++ b/docs/reference/api/python/relax/training.html
@@ -435,7 +435,7 @@ relax.transform.AppendLoss.</p></li>
<dl class="py class">
<dt class="sig sig-object py" id="tvm.relax.training.Trainer">
-<em class="property"><span class="pre">class</span><span class="w">
</span></em><span class="sig-prename descclassname"><span
class="pre">tvm.relax.training.</span></span><span class="sig-name
descname"><span class="pre">Trainer</span></span><span
class="sig-paren">(</span><em class="sig-param"><span class="n"><span
class="pre">train_mod</span></span><span class="p"><span
class="pre">:</span></span><span class="w"> </span><span class="n"><a
class="reference internal" href="../ir.html#tvm [...]
+<em class="property"><span class="pre">class</span><span class="w">
</span></em><span class="sig-prename descclassname"><span
class="pre">tvm.relax.training.</span></span><span class="sig-name
descname"><span class="pre">Trainer</span></span><span
class="sig-paren">(</span><em class="sig-param"><span class="n"><span
class="pre">train_mod</span></span><span class="p"><span
class="pre">:</span></span><span class="w"> </span><span class="n"><a
class="reference internal" href="../ir.html#tvm [...]
<dd><p>Unified wrapper for relax training. It accepts the IRModule (that is
the result of
SetupTrainer) and the relax VM (that contains the built result of the
IRModule), and helps run
the VM. It maintains the parameters, the model states and the optimizer states
internally.</p>
diff --git a/docs/reference/api/python/runtime/vm.html
b/docs/reference/api/python/runtime/vm.html
index 9cee29db9f2..48f212af53c 100644
--- a/docs/reference/api/python/runtime/vm.html
+++ b/docs/reference/api/python/runtime/vm.html
@@ -509,7 +509,7 @@ more details.</p>
<div class="admonition seealso">
<p class="admonition-title">See also</p>
<dl class="simple">
-<dt><a class="reference internal"
href="#tvm.runtime.vm.VMInstrumentReturnKind"
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj
docutils literal notranslate"><span
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible
return values in VM.</p>
+<dt><a class="reference internal"
href="../relax/relax.html#tvm.relax.VMInstrumentReturnKind"
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj
docutils literal notranslate"><span
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible
return values in VM.</p>
</dd>
</dl>
</div>
diff --git a/docs/searchindex.js b/docs/searchindex.js
index 8909d8216c4..beb3d7b3fec 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"API Reference": [[57, null]], "About": [[57,
null]], "Adding a New Backend": [[2, "adding-a-new-backend"]], "Adding a New
Docker Image": [[11, "adding-a-new-docker-image"]], "Additional
Recommendations": [[13, "additional-recommendations"]], "Advanced Build
Configuration": [[59, "advanced-build-configuration"]], "Algorithm": [[3,
"algorithm"]], "Alternative PyTorch import methods": [[52,
"alternative-pytorch-import-methods"]], "Apache TVM Documentation": [ [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"API Reference": [[57, null]], "About": [[57,
null]], "Adding a New Backend": [[2, "adding-a-new-backend"]], "Adding a New
Docker Image": [[11, "adding-a-new-docker-image"]], "Additional
Recommendations": [[13, "additional-recommendations"]], "Advanced Build
Configuration": [[59, "advanced-build-configuration"]], "Algorithm": [[3,
"algorithm"]], "Alternative PyTorch import methods": [[52,
"alternative-pytorch-import-methods"]], "Apache TVM Documentation": [ [...]
\ No newline at end of file
diff --git a/docs/sg_execution_times.html b/docs/sg_execution_times.html
index 1b3f031880f..ef03b1b7a70 100644
--- a/docs/sg_execution_times.html
+++ b/docs/sg_execution_times.html
@@ -297,7 +297,7 @@
<section id="computation-times">
<span id="sphx-glr-sg-execution-times"></span><h1>Computation times<a
class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:22.663</strong> total execution time for 16 files <strong>from
all galleries</strong>:</p>
+<p><strong>00:27.255</strong> total execution time for 16 files <strong>from
all galleries</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -319,47 +319,47 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
class="std std-ref">Optimize Large Language Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.311</p></td>
+<td><p>00:10.447</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="get_started/tutorials/ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
class="std std-ref">IRModule</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../get_started/tutorials/ir_module.py</span></code>)</p></td>
-<td><p>00:06.082</p></td>
+<td><p>00:07.352</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span
class="std std-ref">Importing Models from ML Frameworks</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/import_model.py</span></code>)</p></td>
-<td><p>00:03.353</p></td>
+<td><p>00:06.478</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="how_to/tutorials/customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span
class="pre">../how_to/tutorials/customize_opt.py</span></code>)</p></td>
-<td><p>00:00.745</p></td>
+<td><p>00:00.712</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.584</p></td>
+<td><p>00:00.665</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="how_to/tutorials/cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.479</p></td>
+<td><p>00:00.476</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.296</p></td>
+<td><p>00:00.301</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
class="std std-ref">TensorIR Creation</span></a> (<code class="docutils
literal notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/tir_creation.py</span></code>)</p></td>
-<td><p>00:00.278</p></td>
+<td><p>00:00.282</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="get_started/tutorials/quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
class="std std-ref">Quick Start</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../get_started/tutorials/quick_start.py</span></code>)</p></td>
-<td><p>00:00.156</p></td>
+<td><p>00:00.157</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/relax/tutorials/relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
class="std std-ref">Relax Creation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/relax/tutorials/relax_creation.py</span></code>)</p></td>
-<td><p>00:00.153</p></td>
+<td><p>00:00.156</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.118</p></td>
+<td><p>00:00.119</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/relax/tutorials/relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/relax/tutorials/relax_transformation.py</span></code>)</p></td>
@@ -367,7 +367,7 @@ $(document).ready( function () {
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a>
(<code class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.040</p></td>
+<td><p>00:00.042</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/meta_schedule.py</span></code>)</p></td>