This is an automated email from the ASF dual-hosted git repository.

tqchen pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/tvm-site.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new ce1d635ce9d deploying docs 
(apache/tvm@58d6a5a8f972002ba17f387d961c867a418d987e)
ce1d635ce9d is described below

commit ce1d635ce9dee118a963134150c78140282cbebc
Author: tvm-bot <[email protected]>
AuthorDate: Tue May 26 17:45:53 2026 +0000

    deploying docs (apache/tvm@58d6a5a8f972002ba17f387d961c867a418d987e)
---
 .../import_model.zip                               | Bin 35138 -> 35138 bytes
 .../dlight_gpu_scheduling.zip                      | Bin 26311 -> 26311 bytes
 .../11c11e53c7dace51a8be968ee169ed0d/ir_module.zip | Bin 23790 -> 23790 bytes
 .../tir_transformation.zip                         | Bin 15923 -> 15923 bytes
 .../meta_schedule.zip                              | Bin 24239 -> 24239 bytes
 .../mix_python_and_tvm_with_pymodule.zip           | Bin 39023 -> 39023 bytes
 .../relax_creation.zip                             | Bin 22503 -> 22503 bytes
 .../relax_transformation.zip                       | Bin 11480 -> 11480 bytes
 .../optimize_llm.zip                               | Bin 54007 -> 54007 bytes
 .../bring_your_own_codegen.zip                     | Bin 18423 -> 18423 bytes
 .../e2e_opt_model.zip                              | Bin 14501 -> 14501 bytes
 .../quick_start.zip                                | Bin 16252 -> 16252 bytes
 .../export_and_load_executable.zip                 | Bin 31408 -> 31408 bytes
 .../tir_creation.zip                               | Bin 24545 -> 24545 bytes
 .../cross_compilation_and_rpc.zip                  | Bin 62975 -> 62975 bytes
 .../customize_opt.zip                              | Bin 20544 -> 20544 bytes
 .../relax/tutorials/sg_execution_times.rst.txt     |   4 ++--
 .../tensor_ir/tutorials/sg_execution_times.rst.txt |   8 +++----
 .../tensor_ir/tutorials/tir_creation.rst.txt       |  20 ++++++++--------
 .../tensor_ir/tutorials/tir_transformation.rst.txt |   6 ++---
 .../get_started/tutorials/ir_module.rst.txt        |   8 +++----
 .../get_started/tutorials/quick_start.rst.txt      |   4 ++--
 .../tutorials/sg_execution_times.rst.txt           |   6 ++---
 .../tutorials/cross_compilation_and_rpc.rst.txt    |   6 ++---
 .../how_to/tutorials/customize_opt.rst.txt         |   4 ++--
 .../how_to/tutorials/e2e_opt_model.rst.txt         |   2 +-
 .../how_to/tutorials/sg_execution_times.rst.txt    |  14 +++++------
 docs/_sources/sg_execution_times.rst.txt           |  26 ++++++++++-----------
 .../relax/tutorials/sg_execution_times.html        |   4 ++--
 .../tensor_ir/tutorials/sg_execution_times.html    |   8 +++----
 .../tensor_ir/tutorials/tir_creation.html          |  20 ++++++++--------
 .../tensor_ir/tutorials/tir_transformation.html    |   6 ++---
 docs/get_started/tutorials/ir_module.html          |  16 ++++++-------
 docs/get_started/tutorials/quick_start.html        |  24 +++++++++----------
 docs/get_started/tutorials/sg_execution_times.html |   6 ++---
 docs/how_to/tutorials/bring_your_own_codegen.html  |   4 ++--
 .../tutorials/cross_compilation_and_rpc.html       |   6 ++---
 docs/how_to/tutorials/customize_opt.html           |   8 +++----
 docs/how_to/tutorials/e2e_opt_model.html           |   6 ++---
 .../tutorials/export_and_load_executable.html      |   8 +++----
 docs/how_to/tutorials/import_model.html            |   4 ++--
 docs/how_to/tutorials/optimize_llm.html            |  10 ++++----
 docs/how_to/tutorials/sg_execution_times.html      |  14 +++++------
 docs/objects.inv                                   | Bin 23886 -> 23892 bytes
 docs/reference/api/python/relax/op.html            |   2 +-
 docs/reference/api/python/relax/training.html      |   2 +-
 docs/reference/api/python/runtime/vm.html          |   2 +-
 docs/searchindex.js                                |   2 +-
 docs/sg_execution_times.html                       |  26 ++++++++++-----------
 49 files changed, 143 insertions(+), 143 deletions(-)

diff --git a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip 
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip
index 5307fa722d4..2e0209f902b 100644
Binary files 
a/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip and 
b/docs/_downloads/050dda7aeddba33e083ae36606d3270f/import_model.zip differ
diff --git 
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip
index b2cd955832d..4a7ab83dc06 100644
Binary files 
a/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
and 
b/docs/_downloads/0eec6b3ea6ba863d0ed14b037c4b83ea/dlight_gpu_scheduling.zip 
differ
diff --git a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip 
b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip
index d110c64a5ba..9c10de99cf5 100644
Binary files a/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip 
and b/docs/_downloads/11c11e53c7dace51a8be968ee169ed0d/ir_module.zip differ
diff --git 
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip 
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip
index 4e0e7a38de2..604bf79c0d2 100644
Binary files 
a/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip and 
b/docs/_downloads/18ba0d2ee8120824175aaef66bc9c9bf/tir_transformation.zip differ
diff --git a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip 
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip
index ad5fec8450e..6726a366619 100644
Binary files 
a/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip and 
b/docs/_downloads/249a4acaa182fd00b9c381b53025777d/meta_schedule.zip differ
diff --git 
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
index aa6adc62c50..6c806839696 100644
Binary files 
a/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 and 
b/docs/_downloads/373278b9f6fc686adeaa0219598fe78b/mix_python_and_tvm_with_pymodule.zip
 differ
diff --git 
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip 
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip
index 389c97e0066..0636a870aaf 100644
Binary files 
a/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip and 
b/docs/_downloads/4753776bbe68e7c9ee4d19117973fc8b/relax_creation.zip differ
diff --git 
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip 
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip
index 5d896cd7b5d..011f90eb564 100644
Binary files 
a/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip and 
b/docs/_downloads/7d201684dfa095a5ea48d98e9a2ef7ad/relax_transformation.zip 
differ
diff --git a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip 
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip
index 2d1dff97b63..618432f3ff5 100644
Binary files 
a/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip and 
b/docs/_downloads/83e85f38cf16f1d926d06615fd54095c/optimize_llm.zip differ
diff --git 
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip
index ccd5ed4be1d..930d8aaa4dc 100644
Binary files 
a/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
and 
b/docs/_downloads/8c05f1d580979a2c26428a509c07ed72/bring_your_own_codegen.zip 
differ
diff --git a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip 
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip
index d78c891baee..fd4d5941788 100644
Binary files 
a/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip and 
b/docs/_downloads/a7dd7652b2ad50f82d7b739ce3645799/e2e_opt_model.zip differ
diff --git a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip 
b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip
index e2053577edb..039648d84f7 100644
Binary files a/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip 
and b/docs/_downloads/bb7db6678496193ed0c55d3b95fa6778/quick_start.zip differ
diff --git 
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
index 877217e72a4..2c41c6aea53 100644
Binary files 
a/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 and 
b/docs/_downloads/bc875d02d5382abc9ea5fb9eb2c1de2c/export_and_load_executable.zip
 differ
diff --git a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip 
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip
index 7e3a8e33235..6ec2be8fb9e 100644
Binary files 
a/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip and 
b/docs/_downloads/be26483bb70b8468499a01c55e8e866c/tir_creation.zip differ
diff --git 
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
index 8a7efdcac84..c9ac8593077 100644
Binary files 
a/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 and 
b/docs/_downloads/f69380821f417ef2210f45503d81bded/cross_compilation_and_rpc.zip
 differ
diff --git a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip 
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip
index 7a1c56f7caf..0ea6ac2b8df 100644
Binary files 
a/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip and 
b/docs/_downloads/f69433a4a80715725df90d1386679956/customize_opt.zip differ
diff --git a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
index 02c94414cc1..28636801472 100644
--- a/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/relax/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:00.207** total execution time for 2 files **from 
deep_dive/relax/tutorials**:
+**00:00.210** total execution time for 2 files **from 
deep_dive/relax/tutorials**:
 
 .. container::
 
@@ -33,7 +33,7 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` 
(``relax_creation.py``)
-     - 00:00.153
+     - 00:00.156
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` 
(``relax_transformation.py``)
      - 00:00.054
diff --git 
a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
index b9f3d0213f4..aa55d98193b 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:00.700** total execution time for 4 files **from 
deep_dive/tensor_ir/tutorials**:
+**00:00.709** total execution time for 4 files **from 
deep_dive/tensor_ir/tutorials**:
 
 .. container::
 
@@ -33,13 +33,13 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` 
(``tir_transformation.py``)
-     - 00:00.296
+     - 00:00.301
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` 
(``tir_creation.py``)
-     - 00:00.278
+     - 00:00.282
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py` 
(``dlight_gpu_scheduling.py``)
-     - 00:00.118
+     - 00:00.119
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py` 
(``meta_schedule.py``)
      - 00:00.007
diff --git a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
index 26891cd60f4..6e8766eb425 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_creation.rst.txt
@@ -321,17 +321,17 @@ Now let's check the runtime dynamic shape inference:
 
  .. code-block:: none
 
-    [[0.95137763 0.85107917 0.3689601  1.0835637 ]
-     [1.0141339  0.8445324  0.90911806 1.6136544 ]
-     [0.61329514 0.30123305 0.7088936  1.0339592 ]
-     [0.3933009  0.32253462 0.5984916  0.91722345]]
-    [[30.018133 27.544096 33.03013  ... 28.698242 32.912727 33.821335]
-     [33.676186 29.869184 36.656418 ... 29.558659 33.14984  33.684513]
-     [32.54471  26.390018 33.711388 ... 28.185299 30.113274 30.87361 ]
+    [[0.9666659  1.2409723  1.5023336  0.59038645]
+     [1.466954   1.6336558  2.014125   0.88805574]
+     [0.84090245 0.84245896 1.2488935  0.5728023 ]
+     [1.6342511  1.7535877  2.2908895  0.8626775 ]]
+    [[31.720383 30.115713 31.008839 ... 35.727104 27.078178 32.137962]
+     [31.90414  31.903435 33.14347  ... 34.99326  31.078535 32.050144]
+     [32.191025 29.924765 31.966501 ... 32.56062  28.905807 31.945398]
      ...
-     [33.102913 26.474215 33.315556 ... 28.016031 29.267859 31.99489 ]
-     [30.29241  29.473051 34.34725  ... 28.928728 32.059017 32.707542]
-     [34.434856 28.675213 36.80958  ... 29.167341 34.82008  33.18891 ]]
+     [32.578815 31.375263 32.65559  ... 34.989735 29.628883 30.835232]
+     [35.372784 34.941742 35.593662 ... 38.25491  29.94823  33.333088]
+     [31.666214 30.27533  31.315958 ... 35.74374  29.301    30.93808 ]]
 
 
 
diff --git 
a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt 
b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
index 06f7c0c9b97..825a207a158 100644
--- a/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
+++ b/docs/_sources/deep_dive/tensor_ir/tutorials/tir_transformation.rst.txt
@@ -120,7 +120,7 @@ original implementation.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       2.5840       2.5840       2.5840       2.5840       0.0000              
    
+       2.5827       2.5827       2.5827       2.5827       0.0000              
    
 
 
 
@@ -294,7 +294,7 @@ action involves reordering these two loops.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       0.8664       0.8664       0.8664       0.8664       0.0000              
    
+       0.8629       0.8629       0.8629       0.8629       0.0000              
    
 
 
 
@@ -424,7 +424,7 @@ from the reduction update via the **decompose_reduction** 
primitive.
 
     Execution time summary:
      mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)  
-       0.3553       0.3553       0.3553       0.3553       0.0000              
    
+       0.3554       0.3554       0.3554       0.3554       0.0000              
    
 
 
 
diff --git a/docs/_sources/get_started/tutorials/ir_module.rst.txt 
b/docs/_sources/get_started/tutorials/ir_module.rst.txt
index 4d9472bf1e4..aeec624a78c 100644
--- a/docs/_sources/get_started/tutorials/ir_module.rst.txt
+++ b/docs/_sources/get_started/tutorials/ir_module.rst.txt
@@ -696,8 +696,8 @@ We can deploy the IRModule on CPU by specifying the target 
as ``llvm``.
 
  .. code-block:: none
 
-    [[ 0.02594689  0.12035388 -0.09061812 -0.01100541 -0.09562254  0.1914179
-      -0.18710697  0.08637098 -0.08076213 -0.08568094]]
+    [[-0.01371118 -0.16339105  0.09833568 -0.17663041  0.01454209  0.11325479
+      -0.12773927 -0.15076809  0.13691261  0.15500405]]
 
 
 
@@ -763,8 +763,8 @@ Now we can compile the IRModule on GPU, the similar way as 
we did on CPU.
 
  .. code-block:: none
 
-    [[ 0.02594695  0.12035395 -0.09061813 -0.01100544 -0.09562249  0.1914179
-      -0.18710697  0.08637096 -0.08076215 -0.08568095]]
+    [[-0.01371119 -0.16339102  0.0983357  -0.17663044  0.01454208  0.11325476
+      -0.12773928 -0.15076812  0.13691261  0.15500408]]
 
 
 
diff --git a/docs/_sources/get_started/tutorials/quick_start.rst.txt 
b/docs/_sources/get_started/tutorials/quick_start.rst.txt
index 2d80f361e41..5fbee4f0ffe 100644
--- a/docs/_sources/get_started/tutorials/quick_start.rst.txt
+++ b/docs/_sources/get_started/tutorials/quick_start.rst.txt
@@ -224,8 +224,8 @@ different devices.
 
  .. code-block:: none
 
-    [[25488.164 24824.139 24354.932 25591.982 24814.775 26580.848 23389.242
-      25353.81  24581.1   23791.096]]
+    [[24241.857 24889.234 25745.719 25249.332 25855.055 25478.053 26986.88
+      25268.55  24716.8   24717.395]]
 
 
 
diff --git a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
index ffd8fd3e76a..f4af3bd3d2c 100644
--- a/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/get_started/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:06.237** total execution time for 2 files **from get_started/tutorials**:
+**00:07.509** total execution time for 2 files **from get_started/tutorials**:
 
 .. container::
 
@@ -33,8 +33,8 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_get_started_tutorials_ir_module.py` (``ir_module.py``)
-     - 00:06.082
+     - 00:07.352
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_quick_start.py` 
(``quick_start.py``)
-     - 00:00.156
+     - 00:00.157
      - 0.0
diff --git a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt 
b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
index 30811b08104..8e573ce8862 100644
--- a/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
+++ b/docs/_sources/how_to/tutorials/cross_compilation_and_rpc.rst.txt
@@ -267,7 +267,7 @@ device and returns the measured cost. Network overhead is 
excluded.
 
  .. code-block:: none
 
-    9.7e-08 secs/op
+    9.3e-08 secs/op
 
 
 
@@ -857,8 +857,8 @@ This workflow is applicable to various deployment scenarios:
     Converted PyTorch model to Relax:
       - Number of parameters: 4
     Using local target for demonstration
-    Exported library to: /tmp/tmpj7lc8xaw/model_deployed.so
-    Saved parameters to: /tmp/tmpj7lc8xaw/model_params.npz
+    Exported library to: /tmp/tmpiw5zwg5w/model_deployed.so
+    Saved parameters to: /tmp/tmpiw5zwg5w/model_params.npz
 
     RPC workflow (works for any remote device):
     ==================================================
diff --git a/docs/_sources/how_to/tutorials/customize_opt.rst.txt 
b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
index 015585e6c1d..0f9b4040dd2 100644
--- a/docs/_sources/how_to/tutorials/customize_opt.rst.txt
+++ b/docs/_sources/how_to/tutorials/customize_opt.rst.txt
@@ -426,8 +426,8 @@ We can build and deploy the optimized model to the TVM 
runtime.
 
  .. code-block:: none
 
-    [[25360.273 25977.055 26176.404 26010.953 23686.574 24671.594 25639.94
-      26352.355 26981.152 24812.078]]
+    [[24617.4   24113.621 23790.748 25276.402 22978.486 24189.844 24161.701
+      25012.629 24334.504 21885.848]]
 
 
 
diff --git a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt 
b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
index 2fc22948433..f64db2e0767 100644
--- a/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
+++ b/docs/_sources/how_to/tutorials/e2e_opt_model.rst.txt
@@ -54,7 +54,7 @@ PyTorch.
  .. code-block:: none
 
     Downloading: "https://download.pytorch.org/models/resnet18-f37072fd.pth" 
to /workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
-       0%|          | 0.00/44.7M [00:00<?, ?B/s]      66%|██████▌   | 
29.4M/44.7M [00:00<00:00, 308MB/s]     100%|██████████| 44.7M/44.7M 
[00:00<00:00, 336MB/s]
+       0%|          | 0.00/44.7M [00:00<?, ?B/s]      67%|██████▋   | 
29.8M/44.7M [00:00<00:00, 311MB/s]     100%|██████████| 44.7M/44.7M 
[00:00<00:00, 338MB/s]
 
 
 
diff --git a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt 
b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
index 4ffc2306132..c7b6c4685c9 100644
--- a/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
+++ b/docs/_sources/how_to/tutorials/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:15.518** total execution time for 8 files **from how_to/tutorials**:
+**00:18.827** total execution time for 8 files **from how_to/tutorials**:
 
 .. container::
 
@@ -33,22 +33,22 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` (``optimize_llm.py``)
-     - 00:10.311
+     - 00:10.447
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_import_model.py` (``import_model.py``)
-     - 00:03.353
+     - 00:06.478
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` (``customize_opt.py``)
-     - 00:00.745
+     - 00:00.712
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` (``e2e_opt_model.py``)
-     - 00:00.584
+     - 00:00.665
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` 
(``cross_compilation_and_rpc.py``)
-     - 00:00.479
+     - 00:00.476
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py` 
(``bring_your_own_codegen.py``)
-     - 00:00.040
+     - 00:00.042
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_mix_python_and_tvm_with_pymodule.py` 
(``mix_python_and_tvm_with_pymodule.py``)
      - 00:00.004
diff --git a/docs/_sources/sg_execution_times.rst.txt 
b/docs/_sources/sg_execution_times.rst.txt
index 822b487617a..3fcce8bfba7 100644
--- a/docs/_sources/sg_execution_times.rst.txt
+++ b/docs/_sources/sg_execution_times.rst.txt
@@ -6,7 +6,7 @@
 
 Computation times
 =================
-**00:22.663** total execution time for 16 files **from all galleries**:
+**00:27.255** total execution time for 16 files **from all galleries**:
 
 .. container::
 
@@ -33,43 +33,43 @@ Computation times
      - Time
      - Mem (MB)
    * - :ref:`sphx_glr_how_to_tutorials_optimize_llm.py` 
(``../how_to/tutorials/optimize_llm.py``)
-     - 00:10.311
+     - 00:10.447
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_ir_module.py` 
(``../get_started/tutorials/ir_module.py``)
-     - 00:06.082
+     - 00:07.352
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_import_model.py` 
(``../how_to/tutorials/import_model.py``)
-     - 00:03.353
+     - 00:06.478
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_customize_opt.py` 
(``../how_to/tutorials/customize_opt.py``)
-     - 00:00.745
+     - 00:00.712
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_e2e_opt_model.py` 
(``../how_to/tutorials/e2e_opt_model.py``)
-     - 00:00.584
+     - 00:00.665
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_cross_compilation_and_rpc.py` 
(``../how_to/tutorials/cross_compilation_and_rpc.py``)
-     - 00:00.479
+     - 00:00.476
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_transformation.py` 
(``../deep_dive/tensor_ir/tutorials/tir_transformation.py``)
-     - 00:00.296
+     - 00:00.301
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_tir_creation.py` 
(``../deep_dive/tensor_ir/tutorials/tir_creation.py``)
-     - 00:00.278
+     - 00:00.282
      - 0.0
    * - :ref:`sphx_glr_get_started_tutorials_quick_start.py` 
(``../get_started/tutorials/quick_start.py``)
-     - 00:00.156
+     - 00:00.157
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_creation.py` 
(``../deep_dive/relax/tutorials/relax_creation.py``)
-     - 00:00.153
+     - 00:00.156
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_dlight_gpu_scheduling.py` 
(``../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py``)
-     - 00:00.118
+     - 00:00.119
      - 0.0
    * - :ref:`sphx_glr_deep_dive_relax_tutorials_relax_transformation.py` 
(``../deep_dive/relax/tutorials/relax_transformation.py``)
      - 00:00.054
      - 0.0
    * - :ref:`sphx_glr_how_to_tutorials_bring_your_own_codegen.py` 
(``../how_to/tutorials/bring_your_own_codegen.py``)
-     - 00:00.040
+     - 00:00.042
      - 0.0
    * - :ref:`sphx_glr_deep_dive_tensor_ir_tutorials_meta_schedule.py` 
(``../deep_dive/tensor_ir/tutorials/meta_schedule.py``)
      - 00:00.007
diff --git a/docs/deep_dive/relax/tutorials/sg_execution_times.html 
b/docs/deep_dive/relax/tutorials/sg_execution_times.html
index de29545288a..a1e0b6934b6 100644
--- a/docs/deep_dive/relax/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/relax/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-deep-dive-relax-tutorials-sg-execution-times"></span><h1>Computation
 times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:00.207</strong> total execution time for 2 files <strong>from 
deep_dive/relax/tutorials</strong>:</p>
+<p><strong>00:00.210</strong> total execution time for 2 files <strong>from 
deep_dive/relax/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,7 +319,7 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
 class="std std-ref">Relax Creation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">relax_creation.py</span></code>)</p></td>
-<td><p>00:00.153</p></td>
+<td><p>00:00.156</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">relax_transformation.py</span></code>)</p></td>
diff --git a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html 
b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
index 413511190be..6898e9b5f3c 100644
--- a/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
+++ b/docs/deep_dive/tensor_ir/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-deep-dive-tensor-ir-tutorials-sg-execution-times"></span><h1>Computation
 times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:00.700</strong> total execution time for 4 files <strong>from 
deep_dive/tensor_ir/tutorials</strong>:</p>
+<p><strong>00:00.709</strong> total execution time for 4 files <strong>from 
deep_dive/tensor_ir/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,15 +319,15 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span class="pre">tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.296</p></td>
+<td><p>00:00.301</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
 class="std std-ref">TensorIR Creation</span></a> (<code class="docutils 
literal notranslate"><span class="pre">tir_creation.py</span></code>)</p></td>
-<td><p>00:00.278</p></td>
+<td><p>00:00.282</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
 class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.118</p></td>
+<td><p>00:00.119</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
 class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">meta_schedule.py</span></code>)</p></td>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html 
b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
index f37b089452f..98c431db7fc 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_creation.html
@@ -495,17 +495,17 @@ be used to ascertain the shape and data type of a 
TensorIR.</p>
 <span class="nb">print</span><span class="p">(</span><span 
class="n">evaluate_dynamic_shape</span><span class="p">(</span><span 
class="n">dyn_shape_lib</span><span class="p">,</span> <span 
class="n">m</span><span class="o">=</span><span class="mi">64</span><span 
class="p">,</span> <span class="n">n</span><span class="o">=</span><span 
class="mi">64</span><span class="p">,</span> <a 
href="../../../reference/api/python/tirx/tirx.html#tvm.tirx.IterVar" 
title="tvm.tirx.IterVar" class="sphx-gl [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[0.95137763 0.85107917 0.3689601  
1.0835637 ]
- [1.0141339  0.8445324  0.90911806 1.6136544 ]
- [0.61329514 0.30123305 0.7088936  1.0339592 ]
- [0.3933009  0.32253462 0.5984916  0.91722345]]
-[[30.018133 27.544096 33.03013  ... 28.698242 32.912727 33.821335]
- [33.676186 29.869184 36.656418 ... 29.558659 33.14984  33.684513]
- [32.54471  26.390018 33.711388 ... 28.185299 30.113274 30.87361 ]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[0.9666659  1.2409723  1.5023336  
0.59038645]
+ [1.466954   1.6336558  2.014125   0.88805574]
+ [0.84090245 0.84245896 1.2488935  0.5728023 ]
+ [1.6342511  1.7535877  2.2908895  0.8626775 ]]
+[[31.720383 30.115713 31.008839 ... 35.727104 27.078178 32.137962]
+ [31.90414  31.903435 33.14347  ... 34.99326  31.078535 32.050144]
+ [32.191025 29.924765 31.966501 ... 32.56062  28.905807 31.945398]
  ...
- [33.102913 26.474215 33.315556 ... 28.016031 29.267859 31.99489 ]
- [30.29241  29.473051 34.34725  ... 28.928728 32.059017 32.707542]
- [34.434856 28.675213 36.80958  ... 29.167341 34.82008  33.18891 ]]
+ [32.578815 31.375263 32.65559  ... 34.989735 29.628883 30.835232]
+ [35.372784 34.941742 35.593662 ... 38.25491  29.94823  33.333088]
+ [31.666214 30.27533  31.315958 ... 35.74374  29.301    30.93808 ]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html 
b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
index 6e1597fd45c..8d15f85bb20 100644
--- a/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
+++ b/docs/deep_dive/tensor_ir/tutorials/tir_transformation.html
@@ -374,7 +374,7 @@ original implementation.</p>
 </div>
 <div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   2.5840       2.5840       2.5840       2.5840       0.0000
+   2.5827       2.5827       2.5827       2.5827       0.0000
 </pre></div>
 </div>
 <section id="initialization-schedule">
@@ -472,7 +472,7 @@ class Module:
 
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   0.8664       0.8664       0.8664       0.8664       0.0000
+   0.8629       0.8629       0.8629       0.8629       0.0000
 </pre></div>
 </div>
 </section>
@@ -568,7 +568,7 @@ class Module:
 
 Execution time summary:
  mean (ms)   median (ms)    max (ms)     min (ms)     std (ms)
-   0.3553       0.3553       0.3553       0.3553       0.0000
+   0.3554       0.3554       0.3554       0.3554       0.0000
 </pre></div>
 </div>
 </section>
diff --git a/docs/get_started/tutorials/ir_module.html 
b/docs/get_started/tutorials/ir_module.html
index 23c17c8f0a6..2890ba1eb9b 100644
--- a/docs/get_started/tutorials/ir_module.html
+++ b/docs/get_started/tutorials/ir_module.html
@@ -811,16 +811,16 @@ backends.</p>
 <p>We can deploy the IRModule on CPU by specifying the target as <code 
class="docutils literal notranslate"><span class="pre">llvm</span></code>.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
 
 <span class="n">raw_data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span 
class="p">)</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;float32&quot;</span><span 
class="p">)</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">raw_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
-<span class="n">cpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https:// [...]
+<span class="n">cpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#dict" 
title="builtins.dict" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params_from_torch</span></a><span class="p">[</ [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">cpu_out</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.02594689  0.12035388 -0.09061812 
-0.01100541 -0.09562254  0.1914179
-  -0.18710697  0.08637098 -0.08076213 -0.08568094]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[-0.01371118 -0.16339105  0.09833568 
-0.17663041  0.01454209  0.11325479
+  -0.12773927 -0.15076809  0.13691261  0.15500405]]
 </pre></div>
 </div>
 </section>
@@ -843,19 +843,19 @@ the details of <code class="docutils literal 
notranslate"><span class="pre">DLig
 <p>Now we can compile the IRModule on GPU, the similar way as we did on 
CPU.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-func [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
 <span class="c1"># Need to allocate data and params on GPU device</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">raw_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https:// [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a><span class="p">)</span><s [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">gpu_out</span><span class="p">)</span>
 
 <span class="c1"># Check the correctness of the results</span>
 <span class="k">assert</span> <span class="n">np</span><span 
class="o">.</span><span class="n">allclose</span><span class="p">(</span><span 
class="n">cpu_out</span><span class="p">,</span> <span 
class="n">gpu_out</span><span class="p">,</span> <span 
class="n">atol</span><span class="o">=</span><span class="mf">1e-3</span><span 
class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[ 0.02594695  0.12035395 -0.09061813 
-0.01100544 -0.09562249  0.1914179
-  -0.18710697  0.08637096 -0.08076215 -0.08568095]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[-0.01371119 -0.16339102  0.0983357  
-0.17663044  0.01454208  0.11325476
+  -0.12773928 -0.15076812  0.13691261  0.15500408]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/get_started/tutorials/quick_start.html 
b/docs/get_started/tutorials/quick_start.html
index 55fd299666a..4b16dcf0ad2 100644
--- a/docs/get_started/tutorials/quick_start.html
+++ b/docs/get_started/tutorials/quick_start.html
@@ -452,16 +452,16 @@ different devices.</p>
 <a href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">target</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span 
class="o">.</span><span class="n">target< [...]
 <a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">compile</span [...]
 <span class="n">device</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">784</span><span 
class="p">)</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;float32&quot;</span><span 
class="p">)</span>
 <span class="n">tvm_data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">data</span><span class="p">,</span> <span 
class="n">device</span><span class="o">=</span><span 
class="n">device</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">np</span><span class="o">.</span><span 
class="n">random</span><span class="o">.</span><span class="n">rand</span><span 
class="p">(</span><span class="o">*</span><span class="n">param</span><span 
class="o">.</sp [...]
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">param</span><span class="p">,</span> <span class="n"> [...]
-<span class="nb">print</span><span class="p">(</span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;forward&quot;</span><span 
class="p">](</span><span class="n">tvm_data</span><span class="p">,</span> 
<span class="o">*</span><a href="http [...]
+<span class="nb">print</span><span class="p">(</span><span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;forward&quot;</span><span class="p">](</span><span 
class="n">tvm_data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">params</span></a><span class="p">)</span><s [...]
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[25488.164 24824.139 24354.932 25591.982 
24814.775 26580.848 23389.242
-  25353.81  24581.1   23791.096]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[24241.857 24889.234 25745.719 25249.332 
25855.055 25478.053 26986.88
+  25268.55  24716.8   24717.395]]
 </pre></div>
 </div>
 <p>Our goal is to bring machine learning to the application with any language 
of interest,
@@ -469,8 +469,8 @@ with the minimum runtime support.</p>
 <ul>
 <li><p>Each function in IRModule becomes a runnable function in the runtime. 
For example in LLM
 cases, we can call <code class="docutils literal notranslate"><span 
class="pre">prefill</span></code> and <code class="docutils literal 
notranslate"><span class="pre">decode</span></code> functions directly.</p>
-<div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;prefill&quot;</span><span 
class="p">](</span> [...]
-<span class="n">decoded_logits</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;decode&quot;</span><span 
class="p">](</span><span class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span>< [...]
+<div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">prefill_logits</span> <span 
class="o">=</span> <span class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;prefill&quot;</span><span class="p">](</span><span 
class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">)</span>
+<span class="n">decoded_logits</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;decode&quot;</span><span class="p">](</span><span 
class="n">inputs</span><span class="p">,</span> <span 
class="n">weight</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">)</span>
 </pre></div>
 </div>
 </li>
@@ -485,15 +485,15 @@ copy exchange with existing ecosystem (DLPack exchange 
with PyTorch)</p>
 </li>
 <li><p>TVM runtime works in non-python environments, so it works on settings 
such as mobile</p>
 <div class="highlight-C++ notranslate"><div 
class="highlight"><pre><span></span><span class="c1">// C++ snippet</span>
-<span class="n">runtime</span><span class="o">::</span><span 
class="n">Module</span><span class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w"> 
</span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.r [...]
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="n">GetFunction</span><span 
class="p">(</span><span class="s">&quot;init&quot;</span><span 
class="p">)(...);</span>
-<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="n">GetFunction</span><span 
class="p">(</span><span [...]
+<span class="n">runtime</span><span class="o">::</span><span 
class="n">Module</span><span class="w"> </span><span class="n">vm</span><span 
class="w"> </span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a><span class="p">.</span><span class="n">GetFunction [...]
+<span class="n">vm</span><span class="p">.</span><span 
class="n">GetFunction</span><span class="p">(</span><span 
class="s">&quot;init&quot;</span><span class="p">)(...);</span>
+<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><span class="n">vm</span><span class="p">.</span><span 
class="n">GetFunction</span><span class="p">(</span><span 
class="s">&quot;prefill&quot;</span><span class="p">)(</span><span 
class="n">data</span><span class="p">,</span><span class="w"> </span><span 
class="n">weight</span><span class="p">,</span><span class="w"> </span><span 
class="n" [...]
 </pre></div>
 </div>
 <div class="highlight-Java notranslate"><div 
class="highlight"><pre><span></span><span class="c1">// Java snippet</span>
-<span class="n">Module</span><span class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span class="w"> 
</span><span class="o">=</span><span class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="na">getFunction</span><span 
class="p">(</span><span class="s">&quot;init&quot;</span><span 
class="p">).</span><span class="na">pushArg</span><span 
class="p">(...).</span><span class="na">invoke</span>< [...]
-<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">.</span><span class="na">getFunction</span><span 
class="p">(</span><spa [...]
+<span class="n">Module</span><span class="w"> </span><span 
class="n">vm</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span class="s">&quot;l 
[...]
+<span class="n">vm</span><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span 
class="s">&quot;init&quot;</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(...).</span><span 
class="na">invoke</span><span class="p">;</span>
+<span class="n">Tensor</span><span class="w"> </span><span 
class="n">out</span><span class="w"> </span><span class="o">=</span><span 
class="w"> </span><span class="n">vm</span><span class="p">.</span><span 
class="na">getFunction</span><span class="p">(</span><span 
class="s">&quot;prefill&quot;</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(</span><span 
class="n">data</span><span class="p">).</span><span 
class="na">pushArg</span><span class="p">(</span><spa [...]
 </pre></div>
 </div>
 </li>
diff --git a/docs/get_started/tutorials/sg_execution_times.html 
b/docs/get_started/tutorials/sg_execution_times.html
index 9c71ef5ba2c..d914b09fa66 100644
--- a/docs/get_started/tutorials/sg_execution_times.html
+++ b/docs/get_started/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span 
id="sphx-glr-get-started-tutorials-sg-execution-times"></span><h1>Computation 
times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:06.237</strong> total execution time for 2 files <strong>from 
get_started/tutorials</strong>:</p>
+<p><strong>00:07.509</strong> total execution time for 2 files <strong>from 
get_started/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,11 +319,11 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span 
class="std std-ref">IRModule</span></a> (<code class="docutils literal 
notranslate"><span class="pre">ir_module.py</span></code>)</p></td>
-<td><p>00:06.082</p></td>
+<td><p>00:07.352</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span 
class="std std-ref">Quick Start</span></a> (<code class="docutils literal 
notranslate"><span class="pre">quick_start.py</span></code>)</p></td>
-<td><p>00:00.156</p></td>
+<td><p>00:00.157</p></td>
 <td><p>0.0</p></td>
 </tr>
 </tbody>
diff --git a/docs/how_to/tutorials/bring_your_own_codegen.html 
b/docs/how_to/tutorials/bring_your_own_codegen.html
index 91976b99c85..79d36c8340d 100644
--- a/docs/how_to/tutorials/bring_your_own_codegen.html
+++ b/docs/how_to/tutorials/bring_your_own_codegen.html
@@ -469,7 +469,7 @@ and <code class="docutils literal notranslate"><span 
class="pre">USE_EXAMPLE_NPU
     <span class="k">with</span> <a 
href="../../reference/api/python/transform.html#tvm.ir.transform.PassContext" 
title="tvm.ir.transform.PassContext" 
class="sphx-glr-backref-module-tvm-ir-transform 
sphx-glr-backref-type-py-class"><span class="n">tvm</span><span 
class="o">.</span><span class="n">transform</span><span class="o">.</span><span 
class="n">PassContext</span></a><span class="p">(</span><span 
class="n">opt_level</span><span class="o">=</span><span 
class="mi">3</span><span class=" [...]
         <span class="n">built</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.build" 
title="tvm.relax.build" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-function"><span class="n">relax</span><span 
class="o">.</span><span class="n">build</span></a><span class="p">(</span><span 
class="n">mod</span><span class="p">,</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="s [...]
 
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">built</span><span class="p">,</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cp [...]
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">built</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span  [...]
     <span class="n">result</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">x_np</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span  
[...]
 
     <span class="k">assert</span> <span class="n">result</span><span 
class="o">.</span><span class="n">numpy</span><span class="p">()</span><span 
class="o">.</span><span class="n">shape</span> <span class="o">==</span> <span 
class="p">(</span><span class="mi">2</span><span class="p">,</span> <span 
class="mi">8</span><span class="p">)</span>
@@ -509,7 +509,7 @@ priority), both ops are offloaded as a single composite 
function.</p>
     <span class="n">x2_np</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">randn</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span 
class="mi">32</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><s [...]
     <span class="n">w2_np</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">randn</span><span class="p">(</span><span 
class="mi">16</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span 
class="mi">3</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><sp [...]
 
-    <span class="n">vm2</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">built2</span><span class="p">,</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n"> [...]
+    <span class="n">vm2</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">built2</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><spa [...]
     <span class="n">result2</span> <span class="o">=</span> <span 
class="n">vm2</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span>
         <span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">x2_np</span><span class="p">,</span> <span class="n">tvm</span><span 
class="o">.</span><span class="n">cpu</span><span class="p">()),</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class=" [...]
     <span class="p">)</span>
diff --git a/docs/how_to/tutorials/cross_compilation_and_rpc.html 
b/docs/how_to/tutorials/cross_compilation_and_rpc.html
index ec6e989356d..eb5d8fc8dcf 100644
--- a/docs/how_to/tutorials/cross_compilation_and_rpc.html
+++ b/docs/how_to/tutorials/cross_compilation_and_rpc.html
@@ -474,7 +474,7 @@ device and returns the measured cost. Network overhead is 
excluded.</p>
 <span class="nb">print</span><span class="p">(</span><span 
class="sa">f</span><span class="s2">&quot;</span><span class="si">{</span><span 
class="n">cost</span><span class="si">:</span><span class="s2">g</span><span 
class="si">}</span><span class="s2"> secs/op&quot;</span><span 
class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>9.7e-08 secs/op
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>9.3e-08 secs/op
 </pre></div>
 </div>
 <dl>
@@ -990,8 +990,8 @@ for ONNX models. Simply replace <code class="docutils 
literal notranslate"><span
 Converted PyTorch model to Relax:
   - Number of parameters: 4
 Using local target for demonstration
-Exported library to: /tmp/tmpj7lc8xaw/model_deployed.so
-Saved parameters to: /tmp/tmpj7lc8xaw/model_params.npz
+Exported library to: /tmp/tmpiw5zwg5w/model_deployed.so
+Saved parameters to: /tmp/tmpiw5zwg5w/model_params.npz
 
 RPC workflow (works for any remote device):
 ==================================================
diff --git a/docs/how_to/tutorials/customize_opt.html 
b/docs/how_to/tutorials/customize_opt.html
index 3f83ba34d22..fa3cf221600 100644
--- a/docs/how_to/tutorials/customize_opt.html
+++ b/docs/how_to/tutorials/customize_opt.html
@@ -613,16 +613,16 @@ pushing the performance to the limit. The current 
optimization may not be the be
 <p>We can build and deploy the optimized model to the TVM runtime.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-functi [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
 <span class="c1"># Need to allocate data and params on GPU device</span>
 <span class="n">data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-ba [...]
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">np</span><span 
class="o">.</span><span class="n"> [...]
-<span class="n">gpu_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;forward&quot;</span><span 
class="p">](</span><span class="n">data</span><span class="p">,</span> <span 
class="o">*</span><a href="https [...]
+<span class="n">gpu_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;forward&quot;</span><span class="p">](</span><span 
class="n">data</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">gpu_params</span></a><span class="p">)</span [...]
 <span class="nb">print</span><span class="p">(</span><span 
class="n">gpu_out</span><span class="p">)</span>
 </pre></div>
 </div>
-<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[25360.273 25977.055 26176.404 26010.953 
23686.574 24671.594 25639.94
-  26352.355 26981.152 24812.078]]
+<div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>[[24617.4   24113.621 23790.748 25276.402 
22978.486 24189.844 24161.701
+  25012.629 24334.504 21885.848]]
 </pre></div>
 </div>
 </section>
diff --git a/docs/how_to/tutorials/e2e_opt_model.html 
b/docs/how_to/tutorials/e2e_opt_model.html
index 3118b49a968..239330ad45e 100644
--- a/docs/how_to/tutorials/e2e_opt_model.html
+++ b/docs/how_to/tutorials/e2e_opt_model.html
@@ -332,8 +332,8 @@ PyTorch.</p>
 <div class="sphx-glr-script-out highlight-none notranslate"><div 
class="highlight"><pre><span></span>Downloading: 
&quot;https://download.pytorch.org/models/resnet18-f37072fd.pth&quot; to 
/workspace/.cache/torch/hub/checkpoints/resnet18-f37072fd.pth
 
   0%|          | 0.00/44.7M [00:00&lt;?, ?B/s]
- 66%|██████▌   | 29.4M/44.7M [00:00&lt;00:00, 308MB/s]
-100%|██████████| 44.7M/44.7M [00:00&lt;00:00, 336MB/s]
+ 67%|██████▋   | 29.8M/44.7M [00:00&lt;00:00, 311MB/s]
+100%|██████████| 44.7M/44.7M [00:00&lt;00:00, 338MB/s]
 </pre></div>
 </div>
 </section>
@@ -434,7 +434,7 @@ We skip this step in the CI environment.</p>
         <span class="n">mod</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">s_tir</span><span 
class="o">.</span><span class="n">transform</span><span class="o">.</span><span 
class="n">DefaultGPUSchedule</span><span class="p">()(</span><span 
class="n">mod</span><span class="p">)</span>
     <span class="n">ex</span> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span 
class="n">compile</span></a><span class="p">(</span><span 
class="n">mod</span><span class="p">,</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm [...]
     <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">device</span><span 
class="p">(</span><span class="s2">&quot;cuda&quot;</span><span 
class="p">,</span> <span class="mi">0</span><span class="p">)</span>
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">ex</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">ex</span><span class="p">,</span> <span class="n">dev</span><span 
class="p">)</span>
     <span class="c1"># Need to allocate data and params on GPU device</span>
     <span class="n">gpu_data</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">224< [...]
     <span class="n">gpu_params</span> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n">dev</span><span class="p">)</span> <span 
class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span 
class="n">params</span><span class="p">[</span><span class="s2" [...]
diff --git a/docs/how_to/tutorials/export_and_load_executable.html 
b/docs/how_to/tutorials/export_and_load_executable.html
index 9f6dd6da4f7..7374773bb04 100644
--- a/docs/how_to/tutorials/export_and_load_executable.html
+++ b/docs/how_to/tutorials/export_and_load_executable.html
@@ -446,7 +446,7 @@ runtime module directly.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="k">if</span> <a 
href="https://docs.python.org/3/library/functions.html#bool" 
title="builtins.bool" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">RUN_EXAMPLE</span></a><span class="p">:</span>
     <span class="n">loaded_rt_mod</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">load_module</span><span 
class="p">(</span><span class="nb">str</span><span class="p">(</span><span 
class="n">library_path</span><span class="p">))</span>
     <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">loaded_rt_mod</span><span class="p">,</span> 
<span class="n">dev</span><span class="p">)</span>
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">loaded_rt_mod</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
 
     <span class="c1"># Prepare input data</span>
     <span class="n">input_tensor</span> <span class="o">=</span> <span 
class="n">torch</span><span class="o">.</span><span class="n">randn</span><span 
class="p">(</span><span class="mi">1</span><span class="p">,</span> <span 
class="mi">1</span><span class="p">,</span> <span class="mi">28</span><span 
class="p">,</span> <span class="mi">28</span><span class="p">,</span> <span 
class="n">dtype</span><span class="o">=</span><span class="n">torch</span><span 
class="o">.</span><span class="n">f [...]
@@ -527,7 +527,7 @@ of how to reload and run the model. Save this as <code 
class="docutils literal n
 
 <span class="c1"># Step 2: Create Virtual Machine</span>
 <span class="n">device</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">(</span><span class="mi">0</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">device</span><span class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span 
class="p">)</span>
 
 <span class="c1"># Step 3: Load parameters from the .npz file</span>
 <span class="n">params_npz</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">load</span><span 
class="p">(</span><span 
class="s2">&quot;relax_export_artifacts/model_params.npz&quot;</span><span 
class="p">)</span>
@@ -562,7 +562,7 @@ To run on GPU instead of CPU, make the following 
changes:</p>
 </li>
 <li><p><strong>Use GPU device in the script</strong>:</p>
 <div class="highlight-python notranslate"><div 
class="highlight"><pre><span></span><span class="n">device</span> <span 
class="o">=</span> <span class="n">tvm</span><span class="o">.</span><span 
class="n">cuda</span><span class="p">(</span><span class="mi">0</span><span 
class="p">)</span>  <span class="c1"># Use CUDA device instead of CPU</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">device</span><span class="p">)</span>
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">device</span><span 
class="p">)</span>
 
 <span class="c1"># Load parameters to GPU</span>
 <span class="n">params</span> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span 
class="n">params_npz</span><span class="p">[</span><span 
class="sa">f</span><span class="s2">&quot;p_</span><span 
class="si">{</span><span class="n">i</span><span class="si">}</span><span 
class="s2">&quot;</span><span class="p">],</span> <span class= [...]
@@ -625,7 +625,7 @@ for a comprehensive guide on:</p>
 
 <span class="c1"># Step 4: Load and run on remote device</span>
 <span class="n">lib</span> <span class="o">=</span> <span 
class="n">remote</span><span class="o">.</span><span 
class="n">load_module</span><span class="p">(</span><span 
class="s2">&quot;mlp_arm.so&quot;</span><span class="p">)</span>
-<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class="n">relax</span><span 
class="o">.</span><span class="n">VirtualMachine</span></a><span 
class="p">(</span><span class="n">lib</span><span class="p">,</span> <span 
class="n">remote</span><span class="o">.</span><span class="n">cpu</ [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><span 
class="n">lib</span><span class="p">,</span> <span class="n">remote</span><span 
class="o">.</span><span cla [...]
 <span class="c1"># ... prepare input and params, then run inference</span>
 </pre></div>
 </div>
diff --git a/docs/how_to/tutorials/import_model.html 
b/docs/how_to/tutorials/import_model.html
index 3d89cb7fb29..bf683b39989 100644
--- a/docs/how_to/tutorials/import_model.html
+++ b/docs/how_to/tutorials/import_model.html
@@ -530,13 +530,13 @@ shown below.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="n">mod_compiled</span> <span 
class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.get_pipeline" 
title="tvm.relax.get_pipeline" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-function"><span class="n">relax</span><span 
class="o">.</span><span class="n">get_pipeline</span></a><span 
class="p">(</span><span class="s2">&quot;zero&quot;</span><span cla [...]
 <a href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">exec_module</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">comp [...]
 <span class="n">dev</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">cpu</span><span 
class="p">()</span>
-<a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span class=" [...]
+<span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class [...]
 
 <span class="c1"># Run inference</span>
 <span class="n">input_data</span> <span class="o">=</span> <span 
class="n">np</span><span class="o">.</span><span class="n">random</span><span 
class="o">.</span><span class="n">rand</span><span class="p">(</span><span 
class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span 
class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span 
class="mi">32</span><span class="p">)</span><span class="o">.</span><span 
class="n">astype</span><span class="p">(</span><s [...]
 <span class="n">tvm_input</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">input_data</span><span class="p">,</span> <span 
class="n">dev</span><span class="p">)</span>
 <a href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">tvm_params</span></a> <span class="o">=</span> <span 
class="p">[</span><span class="n">tvm</span><span class="o">.</span><span 
class="n">runtime</span><span class="o">.</span><span 
class="n">tensor</span><span class="p">(</span><span class="n">p</span><span 
class="p">,</span> <span class="n"> [...]
-<span class="n">tvm_out</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;main&quot;</span><span 
class="p">](</span><span class="n">tvm_input</span><span class="p">,</span> 
<span class="o">*</span><a href="htt [...]
+<span class="n">tvm_out</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;main&quot;</span><span class="p">](</span><span 
class="n">tvm_input</span><span class="p">,</span> <span class="o">*</span><a 
href="https://docs.python.org/3/library/stdtypes.html#list" 
title="builtins.list" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">tvm_params</span></a><span class="p">)</sp [...]
 
 <span class="c1"># Compare with PyTorch</span>
 <span class="k">with</span> <span class="n">torch</span><span 
class="o">.</span><span class="n">no_grad</span><span class="p">():</span>
diff --git a/docs/how_to/tutorials/optimize_llm.html 
b/docs/how_to/tutorials/optimize_llm.html
index 28404dde3d2..7cec203820f 100644
--- a/docs/how_to/tutorials/optimize_llm.html
+++ b/docs/how_to/tutorials/optimize_llm.html
@@ -728,7 +728,7 @@ is designed specifically for the LLMs.</p>
 
 <span class="k">with</span> <a 
href="../../reference/api/python/target.html#tvm.target.Target" 
title="tvm.target.Target" class="sphx-glr-backref-module-tvm-target 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">target</span></a><span class="p">:</span>
     <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">ex</span></a> <span class="o">=</span> <a 
href="../../reference/api/python/driver.html#tvm.compile" title="tvm.compile" 
class="sphx-glr-backref-module-tvm sphx-glr-backref-type-py-function"><span 
class="n">tvm</span><span class="o">.</span><span class="n">compile</ [...]
-    <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a> <span 
class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm 
sphx-glr-backref-type-py-class"><span cla [...]
+    <span class="n">vm</span> <span class="o">=</span> <a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VirtualMachine" 
title="tvm.relax.VirtualMachine" class="sphx-glr-backref-module-tvm-relax 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">relax</span><span class="o">.</span><span 
class="n">VirtualMachine</span></a><span class="p">(</span><a 
href="../../reference/api/python/relax/relax.html#tvm.relax.VMExecutable" 
title="tvm.relax.VMExecutable" c [...]
 </pre></div>
 </div>
 </section>
@@ -826,7 +826,7 @@ the model documentation for the correct tokenization and 
prompt format.</p>
 key and value tensors for the attention layer. Apache TVM provides a 
PagedKVCache to store the
 key and value tensors. We create the PagedKVCache with the specified 
parameters.</p>
 <div class="highlight-Python notranslate"><div 
class="highlight"><pre><span></span><span class="k">if</span> <span 
class="ow">not</span> <a 
href="https://docs.python.org/3/library/functions.html#bool" 
title="builtins.bool" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class sphx-glr-backref-instance"><span 
class="n">IS_IN_CI</span></a><span class="p">:</span>
-    <span class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span 
class="s2">&quot;create_tir_paged_kv_cache&quot;</span><span class="p">](</span>
+    <span class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;create_tir_paged_kv_cache&quot;</span><span class="p">](</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">1</span><span class="p">]),</span>  <span 
class="c1"># max_batch_size=1</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>  
<span class="c1"># max_total_seq_len=2048</span>
         <a href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">2048</span><span class="p">]),</span>  
<span class="c1"># prefill_chunk_size=2048</span>
@@ -843,7 +843,7 @@ compiled in the Relax IRModule to embed the tokens into the 
hidden states.</p>
 
 
 <span class="k">def</span><span class="w"> </span><span 
class="nf">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">):</span>
-    <span class="n">_embed</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;embed&quot;</span><span 
class="p">](</span><span class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span  [...]
+    <span class="n">_embed</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;embed&quot;</span><span class="p">](</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
     <span class="c1"># Reshape hidden from [seq_len, hidden_size] to [1, 
seq_len, hidden_size]</span>
     <span class="n">_embed</span> <span class="o">=</span> <span 
class="n">nd_view_func</span><span class="p">(</span><span 
class="n">_embed</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="mi">1</span><span class="p">,</span> <span 
class="n">_embed</span><span class="o">.</span>< [...]
     <span class="k">return</span> <span class="n">_embed</span>
@@ -866,7 +866,7 @@ and <cite>end_forward_func</cite> to end the forward 
pass.</p>
     <span class="n">add_sequence_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">seq_id</span><span class="p">)</span>
     <span class="n">hidden_states</span> <span class="o">=</span> <span 
class="n">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
     <span class="n">begin_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="s [...]
-    <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;prefill&quot;</span><span 
class="p">](</span><span class="n">hidden_states</ [...]
+    <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;prefill&quot;</span><span class="p">](</span><span 
class="n">hidden_states</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
     <span class="n">end_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">)</span>
 </pre></div>
 </div>
@@ -898,7 +898,7 @@ IRModule to generate the token.</p>
         <span class="n">tokens</span> <span class="o">=</span> <span 
class="n">tvm</span><span class="o">.</span><span class="n">runtime</span><span 
class="o">.</span><span class="n">tensor</span><span class="p">(</span><span 
class="n">np</span><span class="o">.</span><span class="n">array</span><span 
class="p">([</span><span class="n">last_token</span><span 
class="p">])</span><span class="o">.</span><span class="n">astype</span><span 
class="p">(</span><span class="s2">&quot;int32&quot;< [...]
         <span class="n">hidden_states</span> <span class="o">=</span> <span 
class="n">embed</span><span class="p">(</span><span 
class="n">tokens</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
         <span class="n">begin_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">,</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" class="sphx-glr-backref-module-builtins 
sphx-glr-backref-type-py-class"><span class="n">Shape</span></a><span 
class="p">([</span><span class="n">seq_id</span><span class="p">]),</span> <a 
href="https://docs.python.org/3/library/stdtypes.html#tuple" 
title="builtins.tuple" clas [...]
-        <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <a 
href="../../reference/api/python/runtime/vm.html#tvm.runtime.vm.VirtualMachine" 
title="tvm.runtime.vm.VirtualMachine" 
class="sphx-glr-backref-module-tvm-runtime-vm sphx-glr-backref-type-py-class 
sphx-glr-backref-instance"><span class="n">vm</span></a><span 
class="p">[</span><span class="s2">&quot;decode&quot;</span><span 
class="p">](</span><span class="n">hidden_state [...]
+        <span class="n">logits</span><span class="p">,</span> <span 
class="n">kv_cache</span> <span class="o">=</span> <span 
class="n">vm</span><span class="p">[</span><span 
class="s2">&quot;decode&quot;</span><span class="p">](</span><span 
class="n">hidden_states</span><span class="p">,</span> <span 
class="n">kv_cache</span><span class="p">,</span> <span 
class="n">params</span><span class="p">)</span>
 
         <span class="n">end_forward_func</span><span class="p">(</span><span 
class="n">kv_cache</span><span class="p">)</span>
         <span class="n">last_token</span> <span class="o">=</span> <span 
class="n">sample_token</span><span class="p">(</span><span 
class="n">logits</span><span class="p">)</span>
diff --git a/docs/how_to/tutorials/sg_execution_times.html 
b/docs/how_to/tutorials/sg_execution_times.html
index 08aa0b128b3..63504becdc3 100644
--- a/docs/how_to/tutorials/sg_execution_times.html
+++ b/docs/how_to/tutorials/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span id="sphx-glr-how-to-tutorials-sg-execution-times"></span><h1>Computation 
times<a class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:15.518</strong> total execution time for 8 files <strong>from 
how_to/tutorials</strong>:</p>
+<p><strong>00:18.827</strong> total execution time for 8 files <strong>from 
how_to/tutorials</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,27 +319,27 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span 
class="std std-ref">Optimize Large Language Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.311</p></td>
+<td><p>00:10.447</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span 
class="std std-ref">Importing Models from ML Frameworks</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">import_model.py</span></code>)</p></td>
-<td><p>00:03.353</p></td>
+<td><p>00:06.478</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span 
class="std std-ref">Customize Optimization</span></a> (<code class="docutils 
literal notranslate"><span class="pre">customize_opt.py</span></code>)</p></td>
-<td><p>00:00.745</p></td>
+<td><p>00:00.712</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span 
class="std std-ref">End-to-End Optimize Model</span></a> (<code class="docutils 
literal notranslate"><span class="pre">e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.584</p></td>
+<td><p>00:00.665</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
 class="std std-ref">Cross Compilation and RPC</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.479</p></td>
+<td><p>00:00.476</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
 class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.040</p></td>
+<td><p>00:00.042</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="mix_python_and_tvm_with_pymodule.html#sphx-glr-how-to-tutorials-mix-python-and-tvm-with-pymodule-py"><span
 class="std std-ref">Mix Python/PyTorch with TVM Using BasePyModule</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">mix_python_and_tvm_with_pymodule.py</span></code>)</p></td>
diff --git a/docs/objects.inv b/docs/objects.inv
index 8cedefd68df..54ee948d55e 100644
Binary files a/docs/objects.inv and b/docs/objects.inv differ
diff --git a/docs/reference/api/python/relax/op.html 
b/docs/reference/api/python/relax/op.html
index 7c0a739340f..f3e4f9782a2 100644
--- a/docs/reference/api/python/relax/op.html
+++ b/docs/reference/api/python/relax/op.html
@@ -2526,7 +2526,7 @@ Defaults to “ij”.</p></li>
 
 <dl class="py function">
 <dt class="sig sig-object py" id="tvm.relax.op.layout_transform">
-<span class="sig-prename descclassname"><span 
class="pre">tvm.relax.op.</span></span><span class="sig-name descname"><span 
class="pre">layout_transform</span></span><span class="sig-paren">(</span><em 
class="sig-param"><span class="n"><span class="pre">x</span></span><span 
class="p"><span class="pre">:</span></span><span class="w"> </span><span 
class="n"><a class="reference internal" href="../ir.html#tvm.ir.RelaxExpr" 
title="tvm.ir.expr.RelaxExpr"><span class="pre">RelaxExpr</span></a></ [...]
+<span class="sig-prename descclassname"><span 
class="pre">tvm.relax.op.</span></span><span class="sig-name descname"><span 
class="pre">layout_transform</span></span><span class="sig-paren">(</span><em 
class="sig-param"><span class="n"><span class="pre">x</span></span><span 
class="p"><span class="pre">:</span></span><span class="w"> </span><span 
class="n"><a class="reference internal" href="../ir.html#tvm.ir.RelaxExpr" 
title="tvm.ir.expr.RelaxExpr"><span class="pre">RelaxExpr</span></a></ [...]
 <dd><p>Modifies the layout of a tensor.</p>
 <dl class="field-list simple">
 <dt class="field-odd">Parameters<span class="colon">:</span></dt>
diff --git a/docs/reference/api/python/relax/training.html 
b/docs/reference/api/python/relax/training.html
index 986b684f71a..a170f682326 100644
--- a/docs/reference/api/python/relax/training.html
+++ b/docs/reference/api/python/relax/training.html
@@ -435,7 +435,7 @@ relax.transform.AppendLoss.</p></li>
 
 <dl class="py class">
 <dt class="sig sig-object py" id="tvm.relax.training.Trainer">
-<em class="property"><span class="pre">class</span><span class="w"> 
</span></em><span class="sig-prename descclassname"><span 
class="pre">tvm.relax.training.</span></span><span class="sig-name 
descname"><span class="pre">Trainer</span></span><span 
class="sig-paren">(</span><em class="sig-param"><span class="n"><span 
class="pre">train_mod</span></span><span class="p"><span 
class="pre">:</span></span><span class="w"> </span><span class="n"><a 
class="reference internal" href="../ir.html#tvm [...]
+<em class="property"><span class="pre">class</span><span class="w"> 
</span></em><span class="sig-prename descclassname"><span 
class="pre">tvm.relax.training.</span></span><span class="sig-name 
descname"><span class="pre">Trainer</span></span><span 
class="sig-paren">(</span><em class="sig-param"><span class="n"><span 
class="pre">train_mod</span></span><span class="p"><span 
class="pre">:</span></span><span class="w"> </span><span class="n"><a 
class="reference internal" href="../ir.html#tvm [...]
 <dd><p>Unified wrapper for relax training. It accepts the IRModule (that is 
the result of
 SetupTrainer) and the relax VM (that contains the built result of the 
IRModule), and helps run
 the VM. It maintains the parameters, the model states and the optimizer states 
internally.</p>
diff --git a/docs/reference/api/python/runtime/vm.html 
b/docs/reference/api/python/runtime/vm.html
index 9cee29db9f2..48f212af53c 100644
--- a/docs/reference/api/python/runtime/vm.html
+++ b/docs/reference/api/python/runtime/vm.html
@@ -509,7 +509,7 @@ more details.</p>
 <div class="admonition seealso">
 <p class="admonition-title">See also</p>
 <dl class="simple">
-<dt><a class="reference internal" 
href="#tvm.runtime.vm.VMInstrumentReturnKind" 
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj 
docutils literal notranslate"><span 
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible 
return values in VM.</p>
+<dt><a class="reference internal" 
href="../relax/relax.html#tvm.relax.VMInstrumentReturnKind" 
title="tvm.runtime.vm.VMInstrumentReturnKind"><code class="xref py py-obj 
docutils literal notranslate"><span 
class="pre">VMInstrumentReturnKind</span></code></a></dt><dd><p>the possible 
return values in VM.</p>
 </dd>
 </dl>
 </div>
diff --git a/docs/searchindex.js b/docs/searchindex.js
index 8909d8216c4..beb3d7b3fec 100644
--- a/docs/searchindex.js
+++ b/docs/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles": {"API Reference": [[57, null]], "About": [[57, 
null]], "Adding a New Backend": [[2, "adding-a-new-backend"]], "Adding a New 
Docker Image": [[11, "adding-a-new-docker-image"]], "Additional 
Recommendations": [[13, "additional-recommendations"]], "Advanced Build 
Configuration": [[59, "advanced-build-configuration"]], "Algorithm": [[3, 
"algorithm"]], "Alternative PyTorch import methods": [[52, 
"alternative-pytorch-import-methods"]], "Apache TVM Documentation": [ [...]
\ No newline at end of file
+Search.setIndex({"alltitles": {"API Reference": [[57, null]], "About": [[57, 
null]], "Adding a New Backend": [[2, "adding-a-new-backend"]], "Adding a New 
Docker Image": [[11, "adding-a-new-docker-image"]], "Additional 
Recommendations": [[13, "additional-recommendations"]], "Advanced Build 
Configuration": [[59, "advanced-build-configuration"]], "Algorithm": [[3, 
"algorithm"]], "Alternative PyTorch import methods": [[52, 
"alternative-pytorch-import-methods"]], "Apache TVM Documentation": [ [...]
\ No newline at end of file
diff --git a/docs/sg_execution_times.html b/docs/sg_execution_times.html
index 1b3f031880f..ef03b1b7a70 100644
--- a/docs/sg_execution_times.html
+++ b/docs/sg_execution_times.html
@@ -297,7 +297,7 @@
             
   <section id="computation-times">
 <span id="sphx-glr-sg-execution-times"></span><h1>Computation times<a 
class="headerlink" href="#computation-times" title="Link to this 
heading"></a></h1>
-<p><strong>00:22.663</strong> total execution time for 16 files <strong>from 
all galleries</strong>:</p>
+<p><strong>00:27.255</strong> total execution time for 16 files <strong>from 
all galleries</strong>:</p>
 <div class="docutils container">
 <style scoped>
 <link 
href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/5.3.0/css/bootstrap.min.css"
 rel="stylesheet" />
@@ -319,47 +319,47 @@ $(document).ready( function () {
 </thead>
 <tbody>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/optimize_llm.html#sphx-glr-how-to-tutorials-optimize-llm-py"><span
 class="std std-ref">Optimize Large Language Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/optimize_llm.py</span></code>)</p></td>
-<td><p>00:10.311</p></td>
+<td><p>00:10.447</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="get_started/tutorials/ir_module.html#sphx-glr-get-started-tutorials-ir-module-py"><span
 class="std std-ref">IRModule</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../get_started/tutorials/ir_module.py</span></code>)</p></td>
-<td><p>00:06.082</p></td>
+<td><p>00:07.352</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/import_model.html#sphx-glr-how-to-tutorials-import-model-py"><span
 class="std std-ref">Importing Models from ML Frameworks</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/import_model.py</span></code>)</p></td>
-<td><p>00:03.353</p></td>
+<td><p>00:06.478</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/customize_opt.html#sphx-glr-how-to-tutorials-customize-opt-py"><span
 class="std std-ref">Customize Optimization</span></a> (<code class="docutils 
literal notranslate"><span 
class="pre">../how_to/tutorials/customize_opt.py</span></code>)</p></td>
-<td><p>00:00.745</p></td>
+<td><p>00:00.712</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/e2e_opt_model.html#sphx-glr-how-to-tutorials-e2e-opt-model-py"><span
 class="std std-ref">End-to-End Optimize Model</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/e2e_opt_model.py</span></code>)</p></td>
-<td><p>00:00.584</p></td>
+<td><p>00:00.665</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="how_to/tutorials/cross_compilation_and_rpc.html#sphx-glr-how-to-tutorials-cross-compilation-and-rpc-py"><span
 class="std std-ref">Cross Compilation and RPC</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/cross_compilation_and_rpc.py</span></code>)</p></td>
-<td><p>00:00.479</p></td>
+<td><p>00:00.476</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/tir_transformation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/tir_transformation.py</span></code>)</p></td>
-<td><p>00:00.296</p></td>
+<td><p>00:00.301</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/tir_creation.html#sphx-glr-deep-dive-tensor-ir-tutorials-tir-creation-py"><span
 class="std std-ref">TensorIR Creation</span></a> (<code class="docutils 
literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/tir_creation.py</span></code>)</p></td>
-<td><p>00:00.278</p></td>
+<td><p>00:00.282</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="get_started/tutorials/quick_start.html#sphx-glr-get-started-tutorials-quick-start-py"><span
 class="std std-ref">Quick Start</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../get_started/tutorials/quick_start.py</span></code>)</p></td>
-<td><p>00:00.156</p></td>
+<td><p>00:00.157</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/relax/tutorials/relax_creation.html#sphx-glr-deep-dive-relax-tutorials-relax-creation-py"><span
 class="std std-ref">Relax Creation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/relax/tutorials/relax_creation.py</span></code>)</p></td>
-<td><p>00:00.153</p></td>
+<td><p>00:00.156</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.html#sphx-glr-deep-dive-tensor-ir-tutorials-dlight-gpu-scheduling-py"><span
 class="std std-ref">DLight: Rule-Based GPU Scheduling</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/dlight_gpu_scheduling.py</span></code>)</p></td>
-<td><p>00:00.118</p></td>
+<td><p>00:00.119</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/relax/tutorials/relax_transformation.html#sphx-glr-deep-dive-relax-tutorials-relax-transformation-py"><span
 class="std std-ref">Transformation</span></a> (<code class="docutils literal 
notranslate"><span 
class="pre">../deep_dive/relax/tutorials/relax_transformation.py</span></code>)</p></td>
@@ -367,7 +367,7 @@ $(document).ready( function () {
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-even"><td><p><a class="reference internal" 
href="how_to/tutorials/bring_your_own_codegen.html#sphx-glr-how-to-tutorials-bring-your-own-codegen-py"><span
 class="std std-ref">Bring Your Own Codegen: NPU Backend Example</span></a> 
(<code class="docutils literal notranslate"><span 
class="pre">../how_to/tutorials/bring_your_own_codegen.py</span></code>)</p></td>
-<td><p>00:00.040</p></td>
+<td><p>00:00.042</p></td>
 <td><p>0.0</p></td>
 </tr>
 <tr class="row-odd"><td><p><a class="reference internal" 
href="deep_dive/tensor_ir/tutorials/meta_schedule.html#sphx-glr-deep-dive-tensor-ir-tutorials-meta-schedule-py"><span
 class="std std-ref">MetaSchedule: Search-Based Auto-Tuning</span></a> (<code 
class="docutils literal notranslate"><span 
class="pre">../deep_dive/tensor_ir/tutorials/meta_schedule.py</span></code>)</p></td>
