This is an automated email from the ASF dual-hosted git repository.
tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 9d95b887c8 deploying docs (apache/tvm@15ac9dbc73a4950f1e353691f2d8624f18ee8b90)
9d95b887c8 is described below
commit 9d95b887c865a66a8deb6f74d20a6a079860be80
Author: tvm-bot <[email protected]>
AuthorDate: Wed Jan 21 02:52:43 2026 +0000
deploying docs (apache/tvm@15ac9dbc73a4950f1e353691f2d8624f18ee8b90)
---
.../11c11e53c7dace51a8be968ee169ed0d/ir_module.zip | Bin 23904 -> 23904 bytes
.../tir_transformation.zip | Bin 15599 -> 15599 bytes
.../relax_creation.zip | Bin 22392 -> 22392 bytes
.../relax_transformation.zip | Bin 11460 -> 11460 bytes
.../optimize_llm.zip | Bin 54101 -> 54101 bytes
.../e2e_opt_model.zip | Bin 14486 -> 14486 bytes
.../quick_start.zip | Bin 16250 -> 16250 bytes
.../export_and_load_executable.zip | Bin 31428 -> 31428 bytes
.../tir_creation.zip | Bin 24379 -> 24379 bytes
.../cross_compilation_and_rpc.zip | Bin 46605 -> 46605 bytes
.../customize_opt.zip | Bin 19813 -> 19813 bytes
.../relax/tutorials/sg_execution_times.rst.txt | 6 +--
.../tensor_ir/tutorials/sg_execution_times.rst.txt | 6 +--
.../tensor_ir/tutorials/tir_creation.rst.txt | 20 +++++-----
.../tensor_ir/tutorials/tir_transformation.rst.txt | 6 +--
.../get_started/tutorials/ir_module.rst.txt | 8 ++--
.../get_started/tutorials/quick_start.rst.txt | 4 +-
.../tutorials/sg_execution_times.rst.txt | 6 +--
.../tutorials/cross_compilation_and_rpc.rst.txt | 6 +--
.../how_to/tutorials/customize_opt.rst.txt | 4 +-
.../how_to/tutorials/e2e_opt_model.rst.txt | 2 +-
.../how_to/tutorials/sg_execution_times.rst.txt | 14 +++----
docs/_sources/sg_execution_times.rst.txt | 26 ++++++------
docs/deep_dive/relax/tutorials/relax_creation.html | 16 +++++++-
.../relax/tutorials/relax_transformation.html | 15 ++++++-
.../relax/tutorials/sg_execution_times.html | 6 +--
.../tensor_ir/tutorials/sg_execution_times.html | 6 +--
.../tensor_ir/tutorials/tir_creation.html | 44 ++++++---------------
.../tensor_ir/tutorials/tir_transformation.html | 23 +++--------
docs/get_started/tutorials/ir_module.html | 16 ++++----
docs/get_started/tutorials/quick_start.html | 24 +++++------
docs/get_started/tutorials/sg_execution_times.html | 6 +--
.../tutorials/cross_compilation_and_rpc.html | 6 +--
docs/how_to/tutorials/customize_opt.html | 8 ++--
docs/how_to/tutorials/e2e_opt_model.html | 11 ++++--
.../tutorials/export_and_load_executable.html | 6 +--
docs/how_to/tutorials/optimize_llm.html | 10 ++---
docs/how_to/tutorials/sg_execution_times.html | 16 ++++----
docs/objects.inv | Bin 19910 -> 19918 bytes
docs/reference/api/python/runtime/vm.html | 2 +-
docs/searchindex.js | 2 +-
docs/sg_execution_times.html | 28 ++++++-------
42 files changed, 174 insertions(+), 179 deletions(-)
diff --git a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
index 0c1f8629e5..5e7971eb16 100644
Binary files a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip and b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip differ
diff --git a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip
index 499d87404c..14dcb517f1 100644
Binary files a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip and b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip differ
diff --git a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip
index 3bc4bf0c53..4f9e0ff507 100644
Binary files a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip and b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip differ
diff --git a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
index 6f57904096..68f314fa56 100644
Binary files a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip and b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip differ
diff --git a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip
index f8187f6b5d..34327d34e9 100644
Binary files a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip and b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip differ
diff --git a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip
index ff0db26542..9613be57cd 100644
Binary files a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip and b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip differ
diff --git a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
index 131c63962c..ee923ff687 100644
Binary files a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip and b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip differ
diff --git a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
index 5140b8025a..40dff75568 100644
Binary files a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip and b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip differ
diff --git a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip
index 8010e43426..a6b1a38132 100644
Binary files a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip and b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip differ
diff --git a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
index f5c2c3007c..09abcd1ea7 100644
Binary files a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip and b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip differ
diff --git a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip
index 5a983e631d..16990396e0 100644
Binary files a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip and b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip differ
diff --git a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
index 2965302a3b..9c8bef6a20 100644
--- a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:00.174** total execution time for 2 files **from deep_dive/relax/tutorials**:
+**00:00.166** total execution time for 2 files **from deep_dive/relax/tutorials**:
.. container::
@@ -33,8 +33,8 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` (``relax_creation.py``)
- - 00:00.110
+ - 00:00.104
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` (``relax_transformation.py``)
- - 00:00.064
+ - 00:00.062
- 0.0
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
index e70b662edd..0b3e7ee9df 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:00.461** total execution time for 2 files **from deep_dive/tensor_ir/tutorials**:
+**00:00.459** total execution time for 2 files **from deep_dive/tensor_ir/tutorials**:
.. container::
@@ -33,8 +33,8 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` (``tir_transformation.py``)
- - 00:00.290
+ - 00:00.292
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` (``tir_creation.py``)
- - 00:00.170
+ - 00:00.168
- 0.0
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
index 046c512580..7a942d617e 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
@@ -319,17 +319,17 @@ Now let's check the runtime dynamic shape inference:
.. code-block:: none
- [[1.7002733 0.97172225 1.297857 0.74689716]
- [1.2288823 0.7683435 0.8796271 0.6174458 ]
- [1.2584145 1.0032675 0.72042626 0.8052738 ]
- [1.8133624 0.96875834 1.2889159 1.3091007 ]]
- [[28.82039 30.19772 35.0464 ... 32.4889 30.753525 27.724426]
- [29.603863 30.972141 35.482548 ... 33.074383 33.304195 32.494915]
- [28.104427 30.18853 31.767584 ... 31.079697 30.035767 29.23242 ]
+ [[1.4768867 0.63970053 1.2864859 1.6733968 ]
+ [1.0805621 0.54556584 0.7396009 0.88587785]
+ [0.8411211 0.30300215 1.1105069 1.5710119 ]
+ [1.5479637 0.6287384 1.2951206 1.7025402 ]]
+ [[32.268753 34.077393 32.4796 ... 35.35778 34.28251 31.377125]
+ [33.914585 35.647198 34.30574 ... 34.140274 34.370613 35.974476]
+ [32.408855 34.634727 31.485485 ... 34.611935 34.722504 32.173973]
...
- [29.475782 31.569359 33.71356 ... 32.6794 32.380157 28.54605 ]
- [31.695753 32.113583 37.301548 ... 33.536453 34.079506 31.707306]
- [29.382877 30.93897 37.715393 ... 32.47629 33.64705 31.184927]]
+ [30.049591 32.61384 32.59786 ... 34.617733 32.099743 31.523752]
+ [33.170135 35.919205 31.360624 ... 34.02011 33.35957 32.457687]
+ [29.723171 31.543198 29.972626 ... 33.52555 33.270355 32.935043]]
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
index ca25e872a7..3a3b28e6f7 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
@@ -117,7 +117,7 @@ original implementation.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 2.7569 2.7569 2.7569 2.7569 0.0000
+ 2.7296 2.7296 2.7296 2.7296 0.0000
@@ -289,7 +289,7 @@ action involves reordering these two loops.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.8602 0.8602 0.8602 0.8602 0.0000
+ 0.8592 0.8592 0.8592 0.8592 0.0000
@@ -417,7 +417,7 @@ from the reduction update via the **decompose_reduction** primitive.
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.3371 0.3371 0.3371 0.3371 0.0000
+ 0.3356 0.3356 0.3356 0.3356 0.0000
diff --git a/docs/_sources/get_started/tutorials/ir_module.rst.txt b/docs/_sources/get_started/tutorials/ir_module.rst.txt
index 687c12381b..881c2b8c58 100644
--- a/docs/_sources/get_started/tutorials/ir_module.rst.txt
+++ b/docs/_sources/get_started/tutorials/ir_module.rst.txt
@@ -692,8 +692,8 @@ We can deploy the IRModule on CPU by specifying the target as ``llvm``.
.. code-block:: none
- [[-0.00813761 0.17235452 0.13928246 -0.19248219 -0.05469553 0.23520306
- 0.05647281 0.01713389 0.0731992 0.06497046]]
+ [[-0.1516748 0.06588852 -0.21453422 -0.06856849 0.06713285 0.0969781
+ -0.08928693 -0.01189575 0.02223392 -0.01426557]]
@@ -759,8 +759,8 @@ Now we can compile the IRModule on GPU, the similar way as we did on CPU.
.. code-block:: none
- [[-0.00813762 0.17235449 0.13928246 -0.19248216 -0.05469557 0.23520309
- 0.05647276 0.01713393 0.07319915 0.06497044]]
+ [[-0.15167476 0.06588843 -0.21453409 -0.06856853 0.06713282 0.09697811
+ -0.08928692 -0.01189569 0.02223385 -0.0142656 ]]
diff --git a/docs/_sources/get_started/tutorials/quick_start.rst.txt b/docs/_sources/get_started/tutorials/quick_start.rst.txt
index 7f0cb65bdf..efd9f615ba 100644
--- a/docs/_sources/get_started/tutorials/quick_start.rst.txt
+++ b/docs/_sources/get_started/tutorials/quick_start.rst.txt
@@ -224,8 +224,8 @@ different devices.
.. code-block:: none
- [[24545.648 27244.729 24525.357 26602.074 25662.295 25989.164 22992.693
- 23667.443 26247.26 23540.232]]
+ [[24854.854 25747.064 25267.877 25039.768 25553.775 24642.68 25147.45
+ 25204.746 25647.812 24801.225]]
diff --git a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
index 20d99020be..818b7c9a92 100644
--- a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:05.376** total execution time for 2 files **from get_started/tutorials**:
+**00:07.513** total execution time for 2 files **from get_started/tutorials**:
.. container::
@@ -33,8 +33,8 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_get_started_tutorials_ir_module.py` (``ir_module.py``)
- - 00:05.205
+ - 00:07.343
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_quick_start.py` (``quick_start.py``)
- - 00:00.171
+ - 00:00.170
- 0.0
diff --git a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
index 57f7d609b0..76a171736a 100644
--- a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
@@ -268,7 +268,7 @@ device and returns the measured cost. Network overhead is excluded.
.. code-block:: none
- 1.28e-07 secs/op
+ 1.25e-07 secs/op
@@ -651,8 +651,8 @@ This workflow is applicable to various deployment scenarios:
Converted PyTorch model to Relax:
- Number of parameters: 4
Using local target for demonstration
- Exported library to: /tmp/tmplotr3l0s/model_deployed.so
- Saved parameters to: /tmp/tmplotr3l0s/model_params.npz
+ Exported library to: /tmp/tmp_fauril4/model_deployed.so
+ Saved parameters to: /tmp/tmp_fauril4/model_params.npz
RPC workflow (works for any remote device):
==================================================
diff --git a/docs/_sources/how_to/tutorials/customize_opt.rst.txt b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
index 6cec35f8b9..b098e75afa 100644
--- a/docs/_sources/how_to/tutorials/customize_opt.rst.txt
+++ b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
@@ -414,8 +414,8 @@ We can build and deploy the optimized model to the TVM runtime.
.. code-block:: none
- [[26702.004 26262.46 24914.695 24891.164 25470.867 25811.766 25893.855
- 24336.852 26334.227 24946.672]]
+ [[23805.832 25195.34 24493.363 24099.348 25487.43 25607.45 24746.297
+ 25004.492 25226.61 24738.535]]
diff --git a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
index c5d924b62d..68af30f3fb 100644
--- a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
+++ b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
@@ -53,7 +53,7 @@ PyTorch.
.. code-block:: none
Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
- 0%| | 0.00/44.7M [00:00<?, ?B/s] 47%|████▋ | 21.1M/44.7M [00:00<00:00, 221MB/s] 100%|██████████| 44.7M/44.7M [00:00<00:00, 261MB/s]
+ 0%| | 0.00/44.7M [00:00<?, ?B/s] 9%|▉ | 4.12M/44.7M [00:00<00:01, 42.5MB/s] 22%|██▏ | 9.75M/44.7M [00:00<00:00, 52.1MB/s] 33%|███▎ | 14.8M/44.7M [00:00<00:00, 35.4MB/s] 50%|████▉ | 22.1M/44.7M [00:00<00:00, 48.1MB/s] 65%|██████▌ | 29.2M/44.7M [00:00<00:00, 55.2MB/s] 86%|████████▌ | 38.4M/44.7M [00:00<00:00, 67.3MB/s] 100%|██████████| 44.7M/44.7M [00:00<00:00, 61.8MB/s]
diff --git a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
index c29e4d6467..88da65504a 100644
--- a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:32.273** total execution time for 5 files **from how_to/tutorials**:
+**00:33.484** total execution time for 5 files **from how_to/tutorials**:
.. container::
@@ -33,16 +33,16 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` (``optimize_llm.py``)
- - 00:30.304
+ - 00:29.694
+ - 0.0
+ * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``e2e_opt_model.py``)
+ - 00:02.245
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` (``cross_compilation_and_rpc.py``)
- - 00:00.790
+ - 00:00.800
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` (``customize_opt.py``)
- - 00:00.680
- - 0.0
- * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``e2e_opt_model.py``)
- - 00:00.497
+ - 00:00.743
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_export_and_load_executable.py` (``export_and_load_executable.py``)
- 00:00.002
diff --git a/docs/_sources/sg_execution_times.rst.txt b/docs/_sources/sg_execution_times.rst.txt
index 27fcaf54eb..1c07fed56a 100644
--- a/docs/_sources/sg_execution_times.rst.txt
+++ b/docs/_sources/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
Computation times
=================
-**00:38.284** total execution time for 11 files **from all galleries**:
+**00:41.623** total execution time for 11 files **from all galleries**:
.. container::
@@ -33,34 +33,34 @@ Computation times
- Time
- Mem (MB)
* - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` (``../how_to/tutorials/optimize_llm.py``)
- - 00:30.304
+ - 00:29.694
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_ir_module.py` (``../get_started/tutorials/ir_module.py``)
- - 00:05.205
+ - 00:07.343
+ - 0.0
+ * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``../how_to/tutorials/e2e_opt_model.py``)
+ - 00:02.245
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` (``../how_to/tutorials/cross_compilation_and_rpc.py``)
- - 00:00.790
+ - 00:00.800
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` (``../how_to/tutorials/customize_opt.py``)
- - 00:00.680
- - 0.0
- * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``../how_to/tutorials/e2e_opt_model.py``)
- - 00:00.497
+ - 00:00.743
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` (``../deep_dive/tensor_ir/tutorials/tir_transformation.py``)
- - 00:00.290
+ - 00:00.292
- 0.0
* - :ref:`sphx_glr_get_started_tutorials_quick_start.py` (``../get_started/tutorials/quick_start.py``)
- - 00:00.171
+ - 00:00.170
- 0.0
* - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` (``../deep_dive/tensor_ir/tutorials/tir_creation.py``)
- - 00:00.170
+ - 00:00.168
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` (``../deep_dive/relax/tutorials/relax_creation.py``)
- - 00:00.110
+ - 00:00.104
- 0.0
* - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` (``../deep_dive/relax/tutorials/relax_transformation.py``)
- - 00:00.064
+ - 00:00.062
- 0.0
* - :ref:`sphx_glr_how_to_tutorials_export_and_load_executable.py` (``../how_to/tutorials/export_and_load_executable.py``)
- 00:00.002
diff --git a/docs/deep_dive/relax/tutorials/relax_creation.html b/docs/deep_dive/relax/tutorials/relax_creation.html
index 8659f92973..46649177cd 100644
--- a/docs/deep_dive/relax/tutorials/relax_creation.html
+++ b/docs/deep_dive/relax/tutorials/relax_creation.html
@@ -192,10 +192,22 @@
<li class="toctree-l1"><a class="reference internal"
href="../../../how_to/dev/index.html">Development Guides</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Deep
Dive</span></p>
-<ul>
+<ul class="current">
<li class="toctree-l1"><a class="reference internal"
href="../../../arch/index.html">Design and Architecture</a></li>
<li class="toctree-l1"><a class="reference internal"
href="../../tensor_ir/index.html">TensorIR</a></li>
-<li class="toctree-l1"><a class="reference internal"
href="../index.html">Relax</a></li>
+<li class="toctree-l1 current"><a class="reference internal"
href="../index.html">Relax</a><ul class="current">
+<li class="toctree-l2"><a class="reference internal"
href="../abstraction.html">Graph Abstraction for ML Models</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../learning.html">Understand Relax Abstraction</a></li>
+<li class="toctree-l2 current"><a class="current reference internal"
href="#">Relax Creation</a><ul>
+<li class="toctree-l3"><a class="reference internal"
href="#create-relax-programs-using-tvmscript">Create Relax programs using
TVMScript</a></li>
+<li class="toctree-l3"><a class="reference internal"
href="#create-relax-programs-using-nnmodule-api">Create Relax programs using
NNModule API</a></li>
+<li class="toctree-l3"><a class="reference internal"
href="#create-relax-programs-using-block-builder-api">Create Relax programs
using Block Builder API</a></li>
+<li class="toctree-l3"><a class="reference internal"
href="#summary">Summary</a></li>
+</ul>
+</li>
+<li class="toctree-l2"><a class="reference internal"
href="relax_transformation.html">Transformation</a></li>
+</ul>
+</li>
</ul>
<p class="caption" role="heading"><span class="caption-text">API
Reference</span></p>
<ul>
diff --git a/docs/deep_dive/relax/tutorials/relax_transformation.html b/docs/deep_dive/relax/tutorials/relax_transformation.html
index 4fb316cdcb..b635544ee3 100644
--- a/docs/deep_dive/relax/tutorials/relax_transformation.html
+++ b/docs/deep_dive/relax/tutorials/relax_transformation.html
@@ -192,10 +192,21 @@
<li class="toctree-l1"><a class="reference internal"
href="../../../how_to/dev/index.html">Development Guides</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Deep
Dive</span></p>
-<ul>
+<ul class="current">
<li class="toctree-l1"><a class="reference internal"
href="../../../arch/index.html">Design and Architecture</a></li>
<li class="toctree-l1"><a class="reference internal"
href="../../tensor_ir/index.html">TensorIR</a></li>
-<li class="toctree-l1"><a class="reference internal"
href="../index.html">Relax</a></li>
+<li class="toctree-l1 current"><a class="reference internal"
href="../index.html">Relax</a><ul class="current">
+<li class="toctree-l2"><a class="reference internal"
href="../abstraction.html">Graph Abstraction for ML Models</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="../learning.html">Understand Relax Abstraction</a></li>
+<li class="toctree-l2"><a class="reference internal"
href="relax_creation.html">Relax Creation</a></li>
+<li class="toctree-l2 current"><a class="current reference internal"
href="#">Transformation</a><ul>
+<li class="toctree-l3"><a class="reference internal"
href="#apply-transformations">Apply transformations</a></li>
+<li class="toctree-l3"><a class="reference internal"
href="#custom-passes">Custom Passes</a></li>
+<li class="toctree-l3"><a class="reference internal"
href="#summary">Summary</a></li>
+</ul>
+</li>
+</ul>
+</li>
</ul>
<p class="caption" role="heading"><span class="caption-text">API
Reference</span></p>
<ul>
diff --git a/docs/deep_dive/relax/tutorials/sg_execution_times.html b/docs/deep_dive/relax/tutorials/sg_execution_times.html
index 8fd4ca9985..5a5f6e950d 100644
--- a/docs/deep_dive/relax/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/relax/tutorials/sg_execution_times.html
@@ -294,7 +294,7 @@
<section id="computation-times">
<span
id="sphx-glr-deep-dive-relax-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:00.174</strong> total execution time for 2 files <strong>from
deep_dive/relax/tutorials</strong>:</p>
+<p><strong>00:00.166</strong> total execution time for 2 files <strong>from
deep_dive/relax/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -316,11 +316,11 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
class="std std-ref">Relax Creation</span></a> (<code class="docutils literal
notranslate"><span class="pre">relax_creation.py</span></code>)</p></td>
-<td><p>00:00.110</p></td>
+<td><p>00:00.104</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span class="pre">relax_transformation.py</span></code>)</p></td>
-<td><p>00:00.064</p></td>
+<td><p>00:00.062</p></td>
<td><p>0.0</p></td>
</tr>
</tbody>
diff --git a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
index 70a90f21a9..534965b289 100644
--- a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
@@ -294,7 +294,7 @@
<section id="computation-times">
<span
id="sphx-glr-deep-dive-tensor-ir-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:00.461</strong> total execution time for 2 files <strong>from
deep_dive/tensor_ir/tutorials</strong>:</p>
+<p><strong>00:00.459</strong> total execution time for 2 files <strong>from
deep_dive/tensor_ir/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -316,11 +316,11 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span class="pre">tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.290</p></td>
+<td><p>00:00.292</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
class="std std-ref">TensorIR Creation</span></a> (<code class="docutils
literal notranslate"><span class="pre">tir_creation.py</span></code>)</p></td>
-<td><p>00:00.170</p></td>
+<td><p>00:00.168</p></td>
<td><p>0.0</p></td>
</tr>
</tbody>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
index 85962404a8..c7b85a353f 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
@@ -193,29 +193,9 @@
<li class="toctree-l1"><a class="reference internal"
href="../../../how_to/dev/index.html">Development Guides</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Deep
Dive</span></p>
-<ul class="current">
+<ul>
<li class="toctree-l1"><a class="reference internal"
href="../../../arch/index.html">Design and Architecture</a></li>
-<li class="toctree-l1 current"><a class="reference internal"
href="../index.html">TensorIR</a><ul class="current">
-<li class="toctree-l2"><a class="reference internal"
href="../abstraction.html">Tensor Program Abstraction</a></li>
-<li class="toctree-l2"><a class="reference internal"
href="../learning.html">Understand TensorIR Abstraction</a></li>
-<li class="toctree-l2 current"><a class="current reference internal"
href="#">TensorIR Creation</a><ul>
-<li class="toctree-l3"><a class="reference internal"
href="#create-tensorir-using-tvmscript">Create TensorIR using TVMScript</a><ul>
-<li class="toctree-l4"><a class="reference internal"
href="#standard-format">Standard Format</a></li>
-<li class="toctree-l4"><a class="reference internal"
href="#concise-with-syntactic-sugar">Concise with Syntactic Sugar</a></li>
-<li class="toctree-l4"><a class="reference internal"
href="#interactive-with-python-variables">Interactive with Python
Variables</a></li>
-<li class="toctree-l4"><a class="reference internal"
href="#tensorir-function-with-dynamic-shapes">TensorIR Function with Dynamic
Shapes</a></li>
-</ul>
-</li>
-<li class="toctree-l3"><a class="reference internal"
href="#create-tensorir-using-tensor-expression">Create TensorIR using Tensor
Expression</a><ul>
-<li class="toctree-l4"><a class="reference internal"
href="#create-static-shape-functions">Create Static-Shape Functions</a></li>
-<li class="toctree-l4"><a class="reference internal"
href="#create-dynamic-shape-functions">Create Dynamic-Shape Functions</a></li>
-</ul>
-</li>
-</ul>
-</li>
-<li class="toctree-l2"><a class="reference internal"
href="tir_transformation.html">Transformation</a></li>
-</ul>
-</li>
+<li class="toctree-l1"><a class="reference internal"
href="../index.html">TensorIR</a></li>
<li class="toctree-l1"><a class="reference internal"
href="../../relax/index.html">Relax</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">API
Reference</span></p>
@@ -509,17 +489,17 @@ be used to ascertain the shape and data type of a
TensorIR.</p>
<span class="nb">print</span><span class="p">(</span><span
class="n">evaluate_dynamic_shape</span><span class="p">(</span><span
class="n">dyn_shape_lib</span><span class="p">,</span> <span
class="n">m</span><span class="o">=</span><span class="mi">64</span><span
class="p">,</span> <span class="n">n</span><span class="o">=</span><span
class="mi">64</span><span class="p">,</span> <a
href="../../../reference/api/python/tir/tir.html#tvm.tir.IterVar"
title="tvm.tir.IterVar" class="sphx-glr-ba [...]
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[1.7002733 0.97172225 1.297857
0.74689716]
- [1.2288823 0.7683435 0.8796271 0.6174458 ]
- [1.2584145 1.0032675 0.72042626 0.8052738 ]
- [1.8133624 0.96875834 1.2889159 1.3091007 ]]
-[[28.82039 30.19772 35.0464 ... 32.4889 30.753525 27.724426]
- [29.603863 30.972141 35.482548 ... 33.074383 33.304195 32.494915]
- [28.104427 30.18853 31.767584 ... 31.079697 30.035767 29.23242 ]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[1.4768867 0.63970053 1.2864859
1.6733968 ]
+ [1.0805621 0.54556584 0.7396009 0.88587785]
+ [0.8411211 0.30300215 1.1105069 1.5710119 ]
+ [1.5479637 0.6287384 1.2951206 1.7025402 ]]
+[[32.268753 34.077393 32.4796 ... 35.35778 34.28251 31.377125]
+ [33.914585 35.647198 34.30574 ... 34.140274 34.370613 35.974476]
+ [32.408855 34.634727 31.485485 ... 34.611935 34.722504 32.173973]
...
- [29.475782 31.569359 33.71356 ... 32.6794 32.380157 28.54605 ]
- [31.695753 32.113583 37.301548 ... 33.536453 34.079506 31.707306]
- [29.382877 30.93897 37.715393 ... 32.47629 33.64705 31.184927]]
+ [30.049591 32.61384 32.59786 ... 34.617733 32.099743 31.523752]
+ [33.170135 35.919205 31.360624 ... 34.02011 33.35957 32.457687]
+ [29.723171 31.543198 29.972626 ... 33.52555 33.270355 32.935043]]
</pre></div>
</div>
</section>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
index 74ffaaba06..2ca9e6c3ac 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
@@ -192,22 +192,9 @@
<li class="toctree-l1"><a class="reference internal"
href="../../../how_to/dev/index.html">Development Guides</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">Deep
Dive</span></p>
-<ul class="current">
+<ul>
<li class="toctree-l1"><a class="reference internal"
href="../../../arch/index.html">Design and Architecture</a></li>
-<li class="toctree-l1 current"><a class="reference internal"
href="../index.html">TensorIR</a><ul class="current">
-<li class="toctree-l2"><a class="reference internal"
href="../abstraction.html">Tensor Program Abstraction</a></li>
-<li class="toctree-l2"><a class="reference internal"
href="../learning.html">Understand TensorIR Abstraction</a></li>
-<li class="toctree-l2"><a class="reference internal"
href="tir_creation.html">TensorIR Creation</a></li>
-<li class="toctree-l2 current"><a class="current reference internal"
href="#">Transformation</a><ul>
-<li class="toctree-l3"><a class="reference internal"
href="#initialization-schedule">Initialization Schedule</a></li>
-<li class="toctree-l3"><a class="reference internal" href="#loop-tiling">Loop
Tiling</a></li>
-<li class="toctree-l3"><a class="reference internal"
href="#leverage-localities">Leverage Localities</a></li>
-<li class="toctree-l3"><a class="reference internal"
href="#rewrite-reduction">Rewrite Reduction</a></li>
-<li class="toctree-l3"><a class="reference internal"
href="#trace-the-transformation">Trace the Transformation</a></li>
-</ul>
-</li>
-</ul>
-</li>
+<li class="toctree-l1"><a class="reference internal"
href="../index.html">TensorIR</a></li>
<li class="toctree-l1"><a class="reference internal"
href="../../relax/index.html">Relax</a></li>
</ul>
<p class="caption" role="heading"><span class="caption-text">API
Reference</span></p>
@@ -381,7 +368,7 @@ original implementation.</p>
</div>
<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 2.7569 2.7569 2.7569 2.7569 0.0000
+ 2.7296 2.7296 2.7296 2.7296 0.0000
</pre></div>
</div>
<section id="initialization-schedule">
@@ -477,7 +464,7 @@ class Module:
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.8602 0.8602 0.8602 0.8602 0.0000
+ 0.8592 0.8592 0.8592 0.8592 0.0000
</pre></div>
</div>
</section>
@@ -571,7 +558,7 @@ class Module:
Execution time summary:
mean (ms) median (ms) max (ms) min (ms) std (ms)
- 0.3371 0.3371 0.3371 0.3371 0.0000
+ 0.3356 0.3356 0.3356 0.3356 0.0000
</pre></div>
</div>
</section>
diff --git a/docs/get_started/tutorials/ir_module.html
b/docs/get_started/tutorials/ir_module.html
index d2497d897c..ff864ff53d 100644
--- a/docs/get_started/tutorials/ir_module.html
+++ b/docs/get_started/tutorials/ir_module.html
@@ -803,16 +803,16 @@ backends.</p>
<p>We can deploy the IRModule on CPU by specifying the target as <code
class="docutils literal notranslate"><span class="pre">llvm</span></code>.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">exec</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">()</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="n">raw_data</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span
class="p">)</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"float32"</span><span
class="p">)</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">raw_data</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
-<span class="n">cpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"main"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https:// [...]
+<span class="n">cpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#dict"
title="builtins.dict" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params_from_torch</span></a><span class="p">[</ [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">cpu_out</span><span class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.00813761 0.17235452 0.13928246
-0.19248219 -0.05469553 0.23520306
- 0.05647281 0.01713389 0.0731992 0.06497046]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.1516748 0.06588852 -0.21453422
-0.06856849 0.06713285 0.0969781
+ -0.08928693 -0.01189575 0.02223392 -0.01426557]]
</pre></div>
</div>
</section>
@@ -835,19 +835,19 @@ the details of <code class="docutils literal
notranslate"><span class="pre">DLig
<p>Now we can compile the IRModule on GPU in the same way as we did on
CPU.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">exec</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">raw_data</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span
class="p">,</span> <span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"main"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https:// [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"main"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a><span class="p">)</span><s [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">gpu_out</span><span class="p">)</span>
<span class="c1"># Check the correctness of the results</span>
<span class="k">assert</span> <span class="n">np</span><span
class="o">.</span><span class="n">allclose</span><span class="p">(</span><span
class="n">cpu_out</span><span class="p">,</span> <span
class="n">gpu_out</span><span class="p">,</span> <span
class="n">atol</span><span class="o">=</span><span class="mf">1e-3</span><span
class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.00813762 0.17235449 0.13928246
-0.19248216 -0.05469557 0.23520309
- 0.05647276 0.01713393 0.07319915 0.06497044]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[-0.15167476 0.06588843 -0.21453409
-0.06856853 0.06713282 0.09697811
+ -0.08928692 -0.01189569 0.02223385 -0.0142656 ]]
</pre></div>
</div>
</section>
diff --git a/docs/get_started/tutorials/quick_start.html
b/docs/get_started/tutorials/quick_start.html
index e5bf8da9fd..d10a4a88b6 100644
--- a/docs/get_started/tutorials/quick_start.html
+++ b/docs/get_started/tutorials/quick_start.html
@@ -449,16 +449,16 @@ different devices.</p>
<a href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">target</span></a> <span class="o">=</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span
class="o">.</span><span class="n">target< [...]
<a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span class="n">compile</span [...]
<span class="n">device</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">()</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="n">data</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span
class="p">)</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"float32"</span><span
class="p">)</span>
<span class="n">tvm_data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">data</span><span class="p">,</span> <span
class="n">device</span><span class="o">=</span><span
class="n">device</span><span class="p">)</span>
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">np</span><span class="o">.</span><span
class="n">random</span><span class="o">.</span><span class="n">rand</span><span
class="p">(</span><span class="o">*</span><span class="n">param</span><span
class="o">.</sp [...]
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span
class="n">param</span><span class="p">,</span> <span class="n"> [...]
-<span class="nb">print</span><span class="p">(</span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"forward"</span><span
class="p">](</span><span class="n">tvm_data</span><span class="p">,</span>
<span class="o">*</span><a href="http [...]
+<span class="nb">print</span><span class="p">(</span><span
class="n">vm</span><span class="p">[</span><span
class="s2">"forward"</span><span class="p">](</span><span
class="n">tvm_data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">params</span></a><span class="p">)</span><s [...]
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[24545.648 27244.729 24525.357 26602.074
25662.295 25989.164 22992.693
- 23667.443 26247.26 23540.232]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[24854.854 25747.064 25267.877 25039.768
25553.775 24642.68 25147.45
+ 25204.746 25647.812 24801.225]]
</pre></div>
</div>
<p>Our goal is to bring machine learning to the application with any language
of interest,
@@ -466,8 +466,8 @@ with the minimum runtime support.</p>
<ul>
<li><p>Each function in the IRModule becomes a runnable function in the runtime.
For example, in LLM
cases, we can call the <code class="docutils literal notranslate"><span
class="pre">prefill</span></code> and <code class="docutils literal
notranslate"><span class="pre">decode</span></code> functions directly.</p>
-<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"prefill"</span><span
class="p">](</span> [...]
-<span class="n">decoded_logits</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"decode"</span><span
class="p">](</span><span class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span>< [...]
+<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span
class="o">=</span> <span class="n">vm</span><span class="p">[</span><span
class="s2">"prefill"</span><span class="p">](</span><span
class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">)</span>
+<span class="n">decoded_logits</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"decode"</span><span class="p">](</span><span
class="n">inputs</span><span class="p">,</span> <span
class="n">weight</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">)</span>
</pre></div>
</div>
</li>
@@ -482,15 +482,15 @@ copy exchange with existing ecosystem (DLPack exchange
with PyTorch)</p>
</li>
<li><p>The TVM runtime works in non-Python environments, so it works in settings
such as mobile</p>
<div class="highlight-C++ notranslate"><div
class="highlight"><pre><span></span><span class="c1">// C++ snippet</span>
-<span class="n">runtime</span><span class="o">::</span><span
class="n">Module</span><span class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w">
</span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.r [...]
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="n">GetFunction</span><span
class="p">(</span><span class="s">"init"</span><span
class="p">)(...);</span>
-<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="n">GetFunction</span><span
class="p">(</span><span [...]
+<span class="n">runtime</span><span class="o">::</span><span
class="n">Module</span><span class="w"> </span><span class="n">vm</span><span
class="w"> </span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a><span class="p">.</span><span class="n">GetFunction [...]
+<span class="n">vm</span><span class="p">.</span><span
class="n">GetFunction</span><span class="p">(</span><span
class="s">"init"</span><span class="p">)(...);</span>
+<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><span class="n">vm</span><span class="p">.</span><span
class="n">GetFunction</span><span class="p">(</span><span
class="s">"prefill"</span><span class="p">)(</span><span
class="n">data</span><span class="p">,</span><span class="w"> </span><span
class="n">weight</span><span class="p">,</span><span class="w"> </span><span
class="n" [...]
</pre></div>
</div>
<div class="highlight-Java notranslate"><div
class="highlight"><pre><span></span><span class="c1">// Java snippet</span>
-<span class="n">Module</span><span class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w">
</span><span class="o">=</span><span class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="na">getFunction</span><span
class="p">(</span><span class="s">"init"</span><span
class="p">).</span><span class="na">pushArg</span><span
class="p">(...).</span><span class="na">invoke</span>< [...]
-<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">.</span><span class="na">getFunction</span><span
class="p">(</span><spa [...]
+<span class="n">Module</span><span class="w"> </span><span
class="n">vm</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span class="s">"l
[...]
+<span class="n">vm</span><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span
class="s">"init"</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(...).</span><span
class="na">invoke</span><span class="p">;</span>
+<span class="n">Tensor</span><span class="w"> </span><span
class="n">out</span><span class="w"> </span><span class="o">=</span><span
class="w"> </span><span class="n">vm</span><span class="p">.</span><span
class="na">getFunction</span><span class="p">(</span><span
class="s">"prefill"</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(</span><span
class="n">data</span><span class="p">).</span><span
class="na">pushArg</span><span class="p">(</span><spa [...]
</pre></div>
</div>
</li>
diff --git a/docs/get_started/tutorials/sg_execution_times.html
b/docs/get_started/tutorials/sg_execution_times.html
index b9e49ca6e0..3446271a52 100644
--- a/docs/get_started/tutorials/sg_execution_times.html
+++ b/docs/get_started/tutorials/sg_execution_times.html
@@ -294,7 +294,7 @@
<section id="computation-times">
<span
id="sphx-glr-get-started-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:05.376</strong> total execution time for 2 files <strong>from
get_started/tutorials</strong>:</p>
+<p><strong>00:07.513</strong> total execution time for 2 files <strong>from
get_started/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -316,11 +316,11 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
class="std std-ref">IRModule</span></a> (<code class="docutils literal
notranslate"><span class="pre">ir_module.py</span></code>)</p></td>
-<td><p>00:05.205</p></td>
+<td><p>00:07.343</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
class="std std-ref">Quick Start</span></a> (<code class="docutils literal
notranslate"><span class="pre">quick_start.py</span></code>)</p></td>
-<td><p>00:00.171</p></td>
+<td><p>00:00.170</p></td>
<td><p>0.0</p></td>
</tr>
</tbody>
diff --git a/docs/how_to/tutorials/cross_compilation_and_rpc.html
b/docs/how_to/tutorials/cross_compilation_and_rpc.html
index 921d56f2a2..d8265f3a02 100644
--- a/docs/how_to/tutorials/cross_compilation_and_rpc.html
+++ b/docs/how_to/tutorials/cross_compilation_and_rpc.html
@@ -473,7 +473,7 @@ device and returns the measured cost. Network overhead is
excluded.</p>
<span class="nb">print</span><span class="p">(</span><span
class="s2">"</span><span class="si">%g</span><span class="s2">
secs/op"</span> <span class="o">%</span> <span class="n">cost</span><span
class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>1.28e-07 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>1.25e-07 secs/op
</pre></div>
</div>
</section>
@@ -822,8 +822,8 @@ for ONNX models. Simply replace <code class="docutils
literal notranslate"><span
<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>Converted PyTorch model to Relax:
- Number of parameters: 4
Using local target for demonstration
-Exported library to: /tmp/tmplotr3l0s/model_deployed.so
-Saved parameters to: /tmp/tmplotr3l0s/model_params.npz
+Exported library to: /tmp/tmp_fauril4/model_deployed.so
+Saved parameters to: /tmp/tmp_fauril4/model_params.npz
RPC workflow (works for any remote device):
==================================================
diff --git a/docs/how_to/tutorials/customize_opt.html
b/docs/how_to/tutorials/customize_opt.html
index 760b41eb9f..c7fa2ed9c2 100644
--- a/docs/how_to/tutorials/customize_opt.html
+++ b/docs/how_to/tutorials/customize_opt.html
@@ -598,16 +598,16 @@ pushing the performance to the limit. The current
optimization may not be the be
<p>We can build and deploy the optimized model to the TVM runtime.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-functi [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class [...]
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-ba [...]
<a href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">np</span><span
class="o">.</span><span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"forward"</span><span
class="p">](</span><span class="n">data</span><span class="p">,</span> <span
class="o">*</span><a href="https [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"forward"</span><span class="p">](</span><span
class="n">data</span><span class="p">,</span> <span class="o">*</span><a
href="https://docs.python.org/3/library/stdtypes.html#list"
title="builtins.list" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">gpu_params</span></a><span class="p">)</span [...]
<span class="nb">print</span><span class="p">(</span><span
class="n">gpu_out</span><span class="p">)</span>
</pre></div>
</div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[26702.004 26262.46 24914.695 24891.164
25470.867 25811.766 25893.855
- 24336.852 26334.227 24946.672]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>[[23805.832 25195.34 24493.363 24099.348
25487.43 25607.45 24746.297
+ 25004.492 25226.61 24738.535]]
</pre></div>
</div>
</section>
diff --git a/docs/how_to/tutorials/e2e_opt_model.html
b/docs/how_to/tutorials/e2e_opt_model.html
index efde41a343..5d8dc104f4 100644
--- a/docs/how_to/tutorials/e2e_opt_model.html
+++ b/docs/how_to/tutorials/e2e_opt_model.html
@@ -328,8 +328,13 @@ PyTorch.</p>
<div class="sphx-glr-script-out highlight-none notranslate"><div
class="highlight"><pre><span></span>Downloading:
"https://download.pytorch.org/models/resnet18-f37072fd.pth" to
/workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
0%| | 0.00/44.7M [00:00<?, ?B/s]
- 47%|████▋ | 21.1M/44.7M [00:00<00:00, 221MB/s]
-100%|██████████| 44.7M/44.7M [00:00<00:00, 261MB/s]
+ 9%|▉ | 4.12M/44.7M [00:00<00:01, 42.5MB/s]
+ 22%|██▏ | 9.75M/44.7M [00:00<00:00, 52.1MB/s]
+ 33%|███▎ | 14.8M/44.7M [00:00<00:00, 35.4MB/s]
+ 50%|████▉ | 22.1M/44.7M [00:00<00:00, 48.1MB/s]
+ 65%|██████▌ | 29.2M/44.7M [00:00<00:00, 55.2MB/s]
+ 86%|████████▌ | 38.4M/44.7M [00:00<00:00, 67.3MB/s]
+100%|██████████| 44.7M/44.7M [00:00<00:00, 61.8MB/s]
</pre></div>
</div>
</section>
@@ -430,7 +435,7 @@ We skip this step in the CI environment.</p>
<span class="n">mod</span> <span class="o">=</span> <a
href="../../reference/api/python/tir/transform.html#tvm.tir.transform.DefaultGPUSchedule"
title="tvm.tir.transform.DefaultGPUSchedule"
class="sphx-glr-backref-module-tvm-tir-transform
sphx-glr-backref-type-py-function"><span class="n">tvm</span><span
class="o">.</span><span class="n">tir</span><span class="o">.</span><span
class="n">transform</span><span class="o">.</span><span
class="n">DefaultGPUSchedule</span></a><span cla [...]
<span class="n">ex</span> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span
class="n">compile</span></a><span class="p">(</span><span
class="n">mod</span><span class="p">,</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm [...]
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span
class="p">(</span><span class="s2">"cuda"</span><span
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
- <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">ex</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">ex</span><span class="p">,</span> <span class="n">dev</span><span
class="p">)</span>
<span class="c1"># Need to allocate data and params on GPU device</span>
<span class="n">gpu_data</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">random</span><span
class="o">.</span><span class="n">rand</span><span class="p">(</span><span
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span
class="p">,</span> <span class="mi">224< [...]
<span class="n">gpu_params</span> <span class="o">=</span> <span
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span
class="n">runtime</span><span class="o">.</span><span
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span
class="p">,</span> <span class="n">dev</span><span class="p">)</span> <span
class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span
class="n">params</span><span class="p">[</span><span class="s2" [...]
diff --git a/docs/how_to/tutorials/export_and_load_executable.html
b/docs/how_to/tutorials/export_and_load_executable.html
index 96632af5b1..0d9f5f4d63 100644
--- a/docs/how_to/tutorials/export_and_load_executable.html
+++ b/docs/how_to/tutorials/export_and_load_executable.html
@@ -441,7 +441,7 @@ runtime module directly.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="k">if</span> <a
href="https://docs.python.org/3/library/functions.html#bool"
title="builtins.bool" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">RUN_EXAMPLE</span></a><span class="p">:</span>
<span class="n">loaded_rt_mod</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">load_module</span><span
class="p">(</span><span class="nb">str</span><span class="p">(</span><span
class="n">library_path</span><span class="p">))</span>
<span class="n">dev</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">(</span><span class="mi">0</span><span class="p">)</span>
- <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">loaded_rt_mod</span><span class="p">,</span>
<span class="n">dev</span><span class="p">)</span>
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">loaded_rt_mod</span><span class="p">,</span> <span
class="n">dev</span><span class="p">)</span>
<span class="c1"># Prepare input data</span>
<span class="n">input_tensor</span> <span class="o">=</span> <span
class="n">torch</span><span class="o">.</span><span class="n">randn</span><span
class="p">(</span><span class="mi">1</span><span class="p">,</span> <span
class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span
class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span
class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span
class="o">.</span><span class="n">f [...]
@@ -522,7 +522,7 @@ of how to reload and run the model. Save this as <code
class="docutils literal n
<span class="c1"># Step 2: Create Virtual Machine</span>
<span class="n">device</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span
class="n">device</span><span class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span
class="p">)</span>
<span class="c1"># Step 3: Load parameters from the .npz file</span>
<span class="n">params_npz</span> <span class="o">=</span> <span
class="n">np</span><span class="o">.</span><span class="n">load</span><span
class="p">(</span><span
class="s2">"relax_export_artifacts/model_params.npz"</span><span
class="p">)</span>
@@ -624,7 +624,7 @@ for a comprehensive guide on:</p>
<span class="c1"># Step 4: Load and run on remote device</span>
<span class="n">lib</span> <span class="o">=</span> <span
class="n">remote</span><span class="o">.</span><span
class="n">load_module</span><span class="p">(</span><span
class="s2">"mlp_arm.so"</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span class="n">relax</span><span
class="o">.</span><span class="n">VirtualMachine</span></a><span
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span
class="n">remote</span><span class="o">.</span><span class="n">cpu</ [...]
+<span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><span
class="n">lib</span><span class="p">,</span> <span class="n">remote</span><span
class="o">.</span><span cla [...]
<span class="c1"># ... prepare input and params, then run inference</span>
</pre></div>
</div>
diff --git a/docs/how_to/tutorials/optimize_llm.html
b/docs/how_to/tutorials/optimize_llm.html
index f3be66cac5..cd966c59ac 100644
--- a/docs/how_to/tutorials/optimize_llm.html
+++ b/docs/how_to/tutorials/optimize_llm.html
@@ -725,7 +725,7 @@ is designed specifically for the LLMs.</p>
<span class="k">with</span> <a
href="../../reference/api/python/target.html#tvm.target.Target"
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">target</span></a><span class="p">:</span>
<a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">ex</span></a> <span class="o">=</span> <a
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile"
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span
class="n">tvm</span><span class="o">.</span><span class="n">compile</ [...]
- <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a> <span
class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm
sphx-glr-backref-type-py-class"><span cla [...]
+ <span class="n">vm</span> <span class="o">=</span> <a
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine"
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">relax</span><span class="o">.</span><span
class="n">VirtualMachine</span></a><span class="p">(</span><a
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable"
title="tvm.relax.VMExecutable" c [...]
</pre></div>
</div>
</section>
@@ -823,7 +823,7 @@ the model documentation for the correct tokenization and
prompt format.</p>
key and value tensors for the attention layer. Apache TVM provides a
PagedKVCache to store the
key and value tensors. We create the PagedKVCache with the specified
parameters.</p>
<div class="highlight-Python notranslate"><div
class="highlight"><pre><span></span><span class="k">if</span> <span
class="ow">not</span> <a
href="https://docs.python.org/3/library/functions.html#bool"
title="builtins.bool" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span
class="n">IS_IN_CI</span></a><span class="p">:</span>
- <span class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span
class="s2">"create_tir_paged_kv_cache"</span><span class="p">](</span>
+ <span class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"create_tir_paged_kv_cache"</span><span class="p">](</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="mi">1</span><span class="p">]),</span> <span
class="c1"># max_batch_size=1</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>
<span class="c1"># max_total_seq_len=2048</span>
<a href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>
<span class="c1"># prefill_chunk_size=2048</span>
@@ -840,7 +840,7 @@ compiled in the Relax IRModule to embed the tokens into the
hidden states.</p>
<span class="k">def</span><span class="w"> </span><span
class="nf">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">):</span>
- <span class="n">_embed</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"embed"</span><span
class="p">](</span><span class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span [...]
+ <span class="n">_embed</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"embed"</span><span class="p">](</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="c1"># Reshape hidden from [seq_len, hidden_size] to [1,
seq_len, hidden_size]</span>
<span class="n">_embed</span> <span class="o">=</span> <span
class="n">nd_view_func</span><span class="p">(</span><span
class="n">_embed</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="mi">1</span><span class="p">,</span> <span
class="n">_embed</span><span class="o">.</s [...]
<span class="k">return</span> <span class="n">_embed</span>
@@ -863,7 +863,7 @@ and <cite>end_forward_func</cite> to end the forward
pass.</p>
<span class="n">add_sequence_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">seq_id</span><span class="p">)</span>
<span class="n">hidden_states</span> <span class="o">=</span> <span
class="n">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">begin_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" cla [...]
- <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"prefill"</span><span
class="p">](</span><span class="n">hidden_states</ [...]
+ <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"prefill"</span><span class="p">](</span><span
class="n">hidden_states</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">end_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">)</span>
</pre></div>
</div>
@@ -895,7 +895,7 @@ IRModule to generate the token.</p>
<span class="n">tokens</span> <span class="o">=</span> <span
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span
class="n">np</span><span class="o">.</span><span class="n">array</span><span
class="p">([</span><span class="n">last_token</span><span
class="p">])</span><span class="o">.</span><span class="n">astype</span><span
class="p">(</span><span class="s2">"int32"< [...]
<span class="n">hidden_states</span> <span class="o">=</span> <span
class="n">embed</span><span class="p">(</span><span
class="n">tokens</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">begin_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">,</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" class="sphx-glr-backref-module-builtins
sphx-glr-backref-type-py-class"><span class="n">ShapeTuple</span></a><span
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a
href="https://docs.python.org/3/library/stdtypes.html#tuple"
title="builtins.tuple" [...]
- <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <a
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine"
title="tvm.runtime.vm.VirtualMachine"
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class
sphx-glr-backref-instance"><span class="n">vm</span></a><span
class="p">[</span><span class="s2">"decode"</span><span
class="p">](</span><span class="n">hidden_state [...]
+ <span class="n">logits</span><span class="p">,</span> <span
class="n">kv_cache</span> <span class="o">=</span> <span
class="n">vm</span><span class="p">[</span><span
class="s2">"decode"</span><span class="p">](</span><span
class="n">hidden_states</span><span class="p">,</span> <span
class="n">kv_cache</span><span class="p">,</span> <span
class="n">params</span><span class="p">)</span>
<span class="n">end_forward_func</span><span class="p">(</span><span
class="n">kv_cache</span><span class="p">)</span>
<span class="n">last_token</span> <span class="o">=</span> <span
class="n">sample_token</span><span class="p">(</span><span
class="n">logits</span><span class="p">)</span>
diff --git a/docs/how_to/tutorials/sg_execution_times.html
b/docs/how_to/tutorials/sg_execution_times.html
index 9b059e76f3..b28e408a71 100644
--- a/docs/how_to/tutorials/sg_execution_times.html
+++ b/docs/how_to/tutorials/sg_execution_times.html
@@ -294,7 +294,7 @@
<section id="computation-times">
<span id="sphx-glr-how-to-tutorials-sg-execution-times"></span><h1>Computation
times<a class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:32.273</strong> total execution time for 5 files <strong>from
how_to/tutorials</strong>:</p>
+<p><strong>00:33.484</strong> total execution time for 5 files <strong>from
how_to/tutorials</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -316,19 +316,19 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
class="std std-ref">Optimize Large Language Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">optimize_llm.py</span></code>)</p></td>
-<td><p>00:30.304</p></td>
+<td><p>00:29.694</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-odd"><td><p><a class="reference internal"
href="cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.790</p></td>
+<tr class="row-odd"><td><p><a class="reference internal"
href="e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code class="docutils
literal notranslate"><span class="pre">e2e_opt_model.py</span></code>)</p></td>
+<td><p>00:02.245</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-even"><td><p><a class="reference internal"
href="customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span class="pre">customize_opt.py</span></code>)</p></td>
-<td><p>00:00.680</p></td>
+<tr class="row-even"><td><p><a class="reference internal"
href="cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
+<td><p>00:00.800</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-odd"><td><p><a class="reference internal"
href="e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code class="docutils
literal notranslate"><span class="pre">e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.497</p></td>
+<tr class="row-odd"><td><p><a class="reference internal"
href="customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span class="pre">customize_opt.py</span></code>)</p></td>
+<td><p>00:00.743</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="export_and_load_executable.html#sphx-glr-how-to-tutorials-export-and-load-executable-py"><span
class="std std-ref">Export and Load Relax Executables</span></a> (<code
class="docutils literal notranslate"><span
class="pre">export_and_load_executable.py</span></code>)</p></td>
diff --git a/docs/objects.inv b/docs/objects.inv
index 61a558f695..c5c80d674e 100644
Binary files a/docs/objects.inv and b/docs/objects.inv differ
diff --git a/docs/reference/api/python/runtime/vm.html
b/docs/reference/api/python/runtime/vm.html
index 8bcfff66cd..68e5797075 100644
--- a/docs/reference/api/python/runtime/vm.html
+++ b/docs/reference/api/python/runtime/vm.html
@@ -490,7 +490,7 @@ more details.</p>
<div class="admonition seealso">
<p class="admonition-title">See also</p>
<dl class="simple">
-<dt><a class="reference internal"
href="#tvm.runtime.vm.VMInstrumentReturnKind"
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj
docutils literal notranslate"><span
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible
return values in VM.</p>
+<dt><a class="reference internal"
href="../relax/relax.html#tvm.relax.VMInstrumentReturnKind"
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj
docutils literal notranslate"><span
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible
return values in VM.</p>
</dd>
</dl>
</div>
diff --git a/docs/searchindex.js b/docs/searchindex.js
index 75fa73ce02..0fad931fe6 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"1. Cross Compile TVM Runtime": [[40,
"cross-compile-tvm-runtime"]], "1. The lack of numpy on device machine caused
the RPC server can\u2019t be launched.": [[40,
"the-lack-of-numpy-on-device-machine-caused-the-rpc-server-can-t-be-launched"]],
"2. Pack and Deploy to Device Machine": [[40,
"pack-and-deploy-to-device-machine"]], "2. The lack of cloudpickle on device
machine caused the RPC server can\u2019t be launched.": [[40,
"the-lack-of-cloudpickle-on-devi [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"1. Cross Compile TVM Runtime": [[40,
"cross-compile-tvm-runtime"]], "1. The lack of numpy on device machine caused
the RPC server can\u2019t be launched.": [[40,
"the-lack-of-numpy-on-device-machine-caused-the-rpc-server-can-t-be-launched"]],
"2. Pack and Deploy to Device Machine": [[40,
"pack-and-deploy-to-device-machine"]], "2. The lack of cloudpickle on device
machine caused the RPC server can\u2019t be launched.": [[40,
"the-lack-of-cloudpickle-on-devi [...]
\ No newline at end of file
diff --git a/docs/sg_execution_times.html b/docs/sg_execution_times.html
index a7df944366..0a7969a348 100644
--- a/docs/sg_execution_times.html
+++ b/docs/sg_execution_times.html
@@ -294,7 +294,7 @@
<section id="computation-times">
<span id="sphx-glr-sg-execution-times"></span><h1>Computation times<a
class="headerlink" href="#computation-times" title="Link to this
heading"></a></h1>
-<p><strong>00:38.284</strong> total execution time for 11 files <strong>from
all galleries</strong>:</p>
+<p><strong>00:41.623</strong> total execution time for 11 files <strong>from
all galleries</strong>:</p>
<div class="docutils container">
<style scoped>
<link
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
rel="stylesheet" />
@@ -316,43 +316,43 @@ $(document).ready( function () {
</thead>
<tbody>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
class="std std-ref">Optimize Large Language Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/optimize_llm.py</span></code>)</p></td>
-<td><p>00:30.304</p></td>
+<td><p>00:29.694</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="get_started/tutorials/ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
class="std std-ref">IRModule</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../get_started/tutorials/ir_module.py</span></code>)</p></td>
-<td><p>00:05.205</p></td>
+<td><p>00:07.343</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.790</p></td>
+<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/e2e_opt_model.py</span></code>)</p></td>
+<td><p>00:02.245</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-odd"><td><p><a class="reference internal"
href="how_to/tutorials/customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span
class="pre">../how_to/tutorials/customize_opt.py</span></code>)</p></td>
-<td><p>00:00.680</p></td>
+<tr class="row-odd"><td><p><a class="reference internal"
href="how_to/tutorials/cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
class="std std-ref">Cross Compilation and RPC</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/cross_compilation_and_rpc.py</span></code>)</p></td>
+<td><p>00:00.800</p></td>
<td><p>0.0</p></td>
</tr>
-<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
class="std std-ref">End-to-End Optimize Model</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.497</p></td>
+<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
class="std std-ref">Customize Optimization</span></a> (<code class="docutils
literal notranslate"><span
class="pre">../how_to/tutorials/customize_opt.py</span></code>)</p></td>
+<td><p>00:00.743</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.290</p></td>
+<td><p>00:00.292</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="get_started/tutorials/quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
class="std std-ref">Quick Start</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../get_started/tutorials/quick_start.py</span></code>)</p></td>
-<td><p>00:00.171</p></td>
+<td><p>00:00.170</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/tensor_ir/tutorials/tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
class="std std-ref">TensorIR Creation</span></a> (<code class="docutils
literal notranslate"><span
class="pre">../deep_dive/tensor_ir/tutorials/tir_creation.py</span></code>)</p></td>
-<td><p>00:00.170</p></td>
+<td><p>00:00.168</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="deep_dive/relax/tutorials/relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
class="std std-ref">Relax Creation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/relax/tutorials/relax_creation.py</span></code>)</p></td>
-<td><p>00:00.110</p></td>
+<td><p>00:00.104</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-odd"><td><p><a class="reference internal"
href="deep_dive/relax/tutorials/relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
class="std std-ref">Transformation</span></a> (<code class="docutils literal
notranslate"><span
class="pre">../deep_dive/relax/tutorials/relax_transformation.py</span></code>)</p></td>
-<td><p>00:00.064</p></td>
+<td><p>00:00.062</p></td>
<td><p>0.0</p></td>
</tr>
<tr class="row-even"><td><p><a class="reference internal"
href="how_to/tutorials/export_and_load_executable.html#sphx-glr-how-to-tutorials-export-and-load-executable-py"><span
class="std std-ref">Export and Load Relax Executables</span></a> (<code
class="docutils literal notranslate"><span
class="pre">../how_to/tutorials/export_and_load_executable.py</span></code>)</p></td>