[tvm] branch nightly updated (cf081d9929 -> def551dfd5)
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch nightly
in repository https://gitbox.apache.org/repos/asf/tvm.git

    from cf081d9929 [BugFix][CPP] Fix cpp deploy bug (#15773)
     add 9d8e6fda50 [ADRENO] Minor changes for Adreno docs and help scripts (#15830)
     add 8b40f5d028 [FRONTEND] Fix unnecessary pylint errors (#15838)
     add def551dfd5 [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance (#15818)

No new revisions were added by this update.

Summary of changes:
 apps/cpp_rtvm/README.md                       |  22 +++
 apps/cpp_rtvm/main.cc                         | 199 ++
 apps/cpp_rtvm/tvm_runner.cc                   | 129 +
 apps/cpp_rtvm/tvm_runner.h                    |  24 +++-
 docs/how_to/deploy/adreno.rst                 |   2 +-
 tests/python/frontend/keras/test_forward.py   |   2 +-
 tests/python/frontend/oneflow/test_forward.py |   2 +-
 tests/scripts/setup-adreno-env.sh             |  18 ++-
 tests/scripts/task_config_build_adreno.sh     |   2 +
 9 files changed, 334 insertions(+), 66 deletions(-)
[tvm] branch main updated: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance (#15818)
This is an automated email from the ASF dual-hosted git repository. srk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new def551dfd5 [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance (#15818) def551dfd5 is described below commit def551dfd50bfff4e9d50108dc4e8027b553b8ec Author: Siva AuthorDate: Fri Sep 29 10:30:20 2023 +0530 [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance (#15818) * [RTVM] Improve rtvm tool with new options to measure native performance Few fixes and enhancements that affects model loading times New options to measure performance. * * review comments * * review comments --- apps/cpp_rtvm/README.md | 22 + apps/cpp_rtvm/main.cc | 199 ++-- apps/cpp_rtvm/tvm_runner.cc | 129 +--- apps/cpp_rtvm/tvm_runner.h | 24 +- 4 files changed, 316 insertions(+), 58 deletions(-) diff --git a/apps/cpp_rtvm/README.md b/apps/cpp_rtvm/README.md index c60a7b0e12..652d46eb58 100644 --- a/apps/cpp_rtvm/README.md +++ b/apps/cpp_rtvm/README.md @@ -122,6 +122,11 @@ Command line usage --input- Numpy file for the model input (optional and we use random of not given) --output - Numpy file name to dump the model output as numpy --dump-meta- Dump model meta information +--pre-compiled - The file name of a file where pre-compiled programs should be stored +--profile - Profile over all execution +--dry-run - Profile after given dry runs, default 10 +--run-count- Profile for given runs, default 50 +--zero-copy- Profile with zero copy api Example ./rtvm --model=keras-resnet50 --device="opencl" --dump-meta @@ -366,3 +371,20 @@ stored. If the pre-compiled file name was passed to the `rtvm` then After method `Load`, method `UsePreCompiledProgram` is called. This method loads pre-compiled programs if the file exists. In opposite case the file will be created and pre-compiled programs will be saved to this file. + +# Performnace Profiling Options +The tool has added few options to measure wall clock performance of the given model on Target natively. +--profile : Can turn on the profiling +--dry-run : The number of times dry run the model before mearuring the performance. Default value os 10 +--run-count : The number times to run the model and take an average. Default value is 50. +--zero-copy: This option enables graph runtime zero copy to be used for input and output than byte copy to DLTensor. + +Performance profile options dumps information summary as given below. 
+ Module Load :27 ms + Graph Runtime Create :11 ms + Params Read :15 ms + Params Set :41 ms + Pre Compiled Progs Load :24 ms +Total Load Time :118 ms +Average ExecTime:27 ms +Unload Time :35.9236 ms diff --git a/apps/cpp_rtvm/main.cc b/apps/cpp_rtvm/main.cc index c38a5f62bd..dc3cf1c414 100644 --- a/apps/cpp_rtvm/main.cc +++ b/apps/cpp_rtvm/main.cc @@ -29,6 +29,7 @@ #endif #include +#include #include #include #include @@ -54,7 +55,11 @@ static const string kUsage = "--input- Numpy file for the model input (optional and we use random of not given)\n" "--output - Numpy file name to dump the model output as numpy\n" "--dump-meta- Dump model meta information\n" -"--pre-compiled - The file name of a file where pre-compiled programs should be stored" +"--pre-compiled - The file name of a file where pre-compiled programs should be stored\n" +"--profile - Profile over all execution\n" +"--dry-run - Profile after given dry runs, default 10\n" +"--run-count- Profile for given runs, default 50\n" +"--zero-copy- Profile with zero copy api\n" "\n" " Example\n" " ./rtvm --model=keras-resnet50 --device=\"opencl\" --dump-meta\n" @@ -68,6 +73,7 @@ static const string kUsage = * \arg input Numpy file for the model input * \arg output Numpy file name to dump the model output as numpy * \arg pre_compiled File name where pre-compiled programs should be stored + * \arg profile Do we profile overall execution */ struct ToolArgs { string model; @@ -75,7 +81,11 @@ struct ToolArgs { string input; string output; string pre_compiled; - bool dump_meta = false; + bool dump_meta{false}; + bool profile{false}; + int dry_run{10}; + int run_count{50}; + bool zero_copy{false}; }; /*! @@ -89,6 +99,10 @@ void PrintArgs(const ToolArgs& args) { LOG(INFO) << "Output= " << args.output; LOG(INFO) << "Pre-compiled = " << args.pre_compiled; LOG(INFO) << "Dump Metadata = " << ((args.dump_meta) ? ("True") : ("False")); + LOG(INFO) << "Profile = " << ((args.profile) ? ("True") : ("False")); +
[GitHub] [tvm] srkreddy1238 merged pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
srkreddy1238 merged PR #15818: URL: https://github.com/apache/tvm/pull/15818
[tvm] branch main updated: [FRONTEND] Fix unnecessary pylint errors (#15838)
This is an automated email from the ASF dual-hosted git repository. srk pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 8b40f5d028 [FRONTEND] Fix unnecessary pylint errors (#15838) 8b40f5d028 is described below commit 8b40f5d028632da82bd6cbf83865041d4186b068 Author: Siva AuthorDate: Fri Sep 29 10:29:00 2023 +0530 [FRONTEND] Fix unnecessary pylint errors (#15838) Handle unnecessary pylint errors from these frontends --- tests/python/frontend/keras/test_forward.py | 2 +- tests/python/frontend/oneflow/test_forward.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/python/frontend/keras/test_forward.py b/tests/python/frontend/keras/test_forward.py index 9d33b15a91..ba3880e186 100644 --- a/tests/python/frontend/keras/test_forward.py +++ b/tests/python/frontend/keras/test_forward.py @@ -28,11 +28,11 @@ from tensorflow import keras as tf_keras # prevent Keras from using up all gpu memory import keras +import pytest import tvm from tvm import relay from tvm.contrib import graph_executor import tvm.testing -import pytest if tf.executing_eagerly(): GPUS = tf.config.experimental.list_physical_devices("GPU") diff --git a/tests/python/frontend/oneflow/test_forward.py b/tests/python/frontend/oneflow/test_forward.py index 7ddc347e86..fda5f1b723 100644 --- a/tests/python/frontend/oneflow/test_forward.py +++ b/tests/python/frontend/oneflow/test_forward.py @@ -20,11 +20,11 @@ import os import numpy as np import oneflow as flow +from packaging import version as package_version import tvm import tvm.testing import tvm.topi.testing from tvm import relay -from packaging import version as package_version MODEL_HOME = "test_model"
[GitHub] [tvm] srkreddy1238 merged pull request #15838: [FRONTEND] Fix unnecessary pylint errors
srkreddy1238 merged PR #15838: URL: https://github.com/apache/tvm/pull/15838
[GitHub] [tvm] vinx13 commented on pull request #15839: [Unity] Include last kernel launch in captured CudaGraph
vinx13 commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1740116164 That's correct. It's relying on the current behavior of `StaticPlanBlockMemory`, which doesn't rewrite the input/output tensors. So only tensors from `R.builtin.alloc_tensor` can be returned (similar to the current implementation of this pass, `StaticPlanBlockMemory` doesn't rewrite the body of `SeqExpr`).
[GitHub] [tvm] Lunderberg commented on pull request #15839: [Unity] Include last kernel launch in captured CudaGraph
Lunderberg commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1740113779 I'm afraid I don't quite understand. Either `R.memory.alloc_storage` or `R.builtin.alloc_tensor` could be independent allocations that could be returned as output. (e.g. a static memory plan allocates using `R.memory.alloc_storage`, reuses that buffer several times internally, before returning it with the final result.) Oh, I think I see. That would mean that it's relying on the current behavior of `StaticPlanBlockMemory`, and assumes that the user-provided parameters are identical across each call. Effectively, the last kernel launch would also be serving as a memcpy from the memory used by the captured cudagraph to other memory space. Is that understanding correct?
[GitHub] [tvm] zxybazh commented on a diff in pull request #15823: [Unity] Add Relax multi-device e2e cases
zxybazh commented on code in PR #15823: URL: https://github.com/apache/tvm/pull/15823#discussion_r1340704043 ## src/tir/transforms/default_gpu_schedule.cc: ## @@ -98,24 +98,48 @@ IRModule MarkScheduled(const IRModule& mod) { mod->type_definitions, // type_definitions mod->import_set_, // import_set mod->source_map,// map - mod->attrs);// attrs); + mod->attrs, // attrs + mod->global_infos); // global_infos +} + +bool IsScheduledOnGPU(const BaseFunc& func) { + Optional target = func->attrs.GetAttr(tvm::attr::kTarget); + if (target.defined()) { Review Comment: Do we need to also check target from context here, i.e., GPU target defined in context but not in the function attribute?
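For reference, a minimal Python-side sketch of the situation the review comment asks about: a GPU target supplied only by an enclosing target context, not by a function attribute. `tvm.target.Target.current()` is the context-side lookup; this is an illustrative sketch assuming a stock TVM install and is not part of the patch.

```python
import tvm

# No target context active: a lookup that relies only on context sees nothing.
print(tvm.target.Target.current(allow_none=True))  # None

with tvm.target.Target("cuda"):
    # Inside the context, the GPU target is recoverable even if the function
    # being scheduled carries no "target" attribute of its own.
    target = tvm.target.Target.current(allow_none=True)
    print(target.kind.name)  # cuda
```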
[GitHub] [tvm] yongwww commented on pull request #15772: [Unity][QNN] Add QNN quantize/dequantize ops
yongwww commented on PR #15772: URL: https://github.com/apache/tvm/pull/15772#issuecomment-1740031754 cc: @slyubomirsky @psrivas2
[GitHub] [tvm] yongwww commented on pull request #15686: [Unity] Add new Relax annotation ops: smooth and absmax
yongwww commented on PR #15686: URL: https://github.com/apache/tvm/pull/15686#issuecomment-1739990584 @ibsidorenko thanks for the effort! It looks like we might not need to use these two ops; shall we close this PR?
[GitHub] [tvm] vinx13 commented on pull request #15839: [Unity] Include last kernel launch in captured CudaGraph
vinx13 commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1739959284 I didn't consider `R.builtin.alloc_tensor` previously because it can be either a dynamic allocation or memory that is visible externally (input/output tensors). The implication here is that running the model with CUDA graph enabled will overwrite the output tensors of previous runs.
[GitHub] [tvm] masahi commented on pull request #15839: [Unity] Include last kernel launch in captured CudaGraph
masahi commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1739872085 cc @vinx13
[tvm] branch unity updated: [Unity] Support Padding Reversal in Alter-Op pass (#15679)
This is an automated email from the ASF dual-hosted git repository. masahi pushed a commit to branch unity in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/unity by this push: new 225d067fb8 [Unity] Support Padding Reversal in Alter-Op pass (#15679) 225d067fb8 is described below commit 225d067fb85e37723bbb3a5ca51f872c0e0f6abf Author: rutkoor <120498024+rutk...@users.noreply.github.com> AuthorDate: Thu Sep 28 23:19:01 2023 +0530 [Unity] Support Padding Reversal in Alter-Op pass (#15679) * Support for padding Reversal in Alter-op pass * Removing lambda from remove_pad te.compute * Applying clang-format on .cc file * Removing extra line from manipulate.h file --- .../tvm/relax/transform/legalize_ops/manipulate.py | 10 ++- python/tvm/relax/transform/transform.py| 2 +- src/relax/transform/alter_op_impl.cc | 81 +++--- src/te/operation/compute_op.cc | 6 +- tests/python/relax/test_transform_alter_op_impl.py | 80 + 5 files changed, 153 insertions(+), 26 deletions(-) diff --git a/python/tvm/relax/transform/legalize_ops/manipulate.py b/python/tvm/relax/transform/legalize_ops/manipulate.py index 4e06a0df39..e56240dc0d 100644 --- a/python/tvm/relax/transform/legalize_ops/manipulate.py +++ b/python/tvm/relax/transform/legalize_ops/manipulate.py @@ -182,7 +182,15 @@ def _layout_transform(bb: BlockBuilder, call: Call) -> Expr: ) index_map: tvm.tir.IndexMap = call.attrs.index_map -pad_value = call.attrs.pad_value.value +pad_value = call.attrs.pad_value +if pad_value is not None: +pad_value = pad_value.value +else: +if "int" in call.args[0].struct_info.dtype: +pad_value = int(0) +else: +pad_value = float(0.0) + axis_separators: tvm.tir.IndexMap.AXIS_SEPARATOR = call.attrs.axis_separators # Convert to list from array axis_separators = list(map(lambda x: x.value, axis_separators)) diff --git a/python/tvm/relax/transform/transform.py b/python/tvm/relax/transform/transform.py index 2a06d5098e..72a9966a4b 100644 --- a/python/tvm/relax/transform/transform.py +++ b/python/tvm/relax/transform/transform.py @@ -1034,7 +1034,7 @@ def AlterOpImpl( l = [] for transform in transform_list: if isinstance(transform, Callable): -transform = IndexMap.from_func(transform) +transform = IndexMap.from_func_with_separators(transform)[0] l.append(transform) op_buffer_transforms[operator_name] = l diff --git a/src/relax/transform/alter_op_impl.cc b/src/relax/transform/alter_op_impl.cc index 9813c4ed24..98d64dd7a8 100644 --- a/src/relax/transform/alter_op_impl.cc +++ b/src/relax/transform/alter_op_impl.cc @@ -30,7 +30,11 @@ #include #include #include +#include #include +#include + +#include "../../te/operation/create_primfunc.h" namespace tvm { namespace relax { @@ -162,8 +166,18 @@ class AlterOpImplMutator : public ExprMutator { return arr_tensor_sinfo; } + bool IsScalarConstant(const Expr& expr) { +if (expr->IsInstance() && expr.as()->is_scalar()) { + return true; +} +return false; + } + Expr TransformLayout(const Expr& expr, const IndexMap& index_map, - const Array axis_separators) { + const Array& axis_separators) { +if (IsScalarConstant(expr) || index_map.get() == nullptr) { + return expr; +} ObjectPtr attrs = make_object(); // We want to avoid two layout_transform ops to share the same index map even if they are // identical. The scope of vars used in index map initial indices is local to the op. Not doing @@ -173,19 +187,70 @@ class AlterOpImplMutator : public ExprMutator { return Call(layout_transform_op_, {expr}, Attrs{std::move(attrs)}, {}); } + /*! 
+ * \brief Adds the \p remove_pad op to the module if it has not already been added before. + * \returns The global var associated with the remove_pad PrimFunc. + */ + GlobalVar GetOrCreateRemovePadOp(const Array& old_shape, const DataType& dtype) { +int t_shape = old_shape.size(); +if (remove_pad_map_.count(t_shape) != 0) { + return remove_pad_map_[t_shape]; +} +// Create dynamic shapes for input and output tensors +Array dyn_padded_shape, dyn_old_shape; +for (int i = 0; i < t_shape; i++) { + tir::Var var1("p" + std::to_string(i), old_shape[i].dtype()); + tir::Var var2("i" + std::to_string(i), old_shape[i].dtype()); + dyn_padded_shape.push_back(var1); + dyn_old_shape.push_back(var2); +} + +// Input tensor of remove_pad op +te::Tensor placeholder_tensor = te::placeholder(dyn_padded_shape, dtype, "input"); +// Output tensor of remove_pad op +te::Tensor output_tensor = te::compute( +dyn_old_shape, +
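As a side note on the `transform.py` hunk in the commit above, `IndexMap.from_func_with_separators` returns a pair rather than a single map, which is why the pass keeps only element `[0]`. A small hedged sketch follows; the example mapping is mine and not taken from the commit.

```python
from tvm.tir import IndexMap

# The helper returns (index_map, axis_separators); a mapping function that uses
# no IndexMap.AXIS_SEPARATOR sentinel simply yields an empty separator list.
index_map, axis_separators = IndexMap.from_func_with_separators(
    lambda n, c, h, w: (n, c // 4, h, w, c % 4)
)
print(index_map)        # NCHW -> NCHW4c style layout transform
print(axis_separators)  # []
```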
[GitHub] [tvm] masahi merged pull request #15679: [Unity] Support Padding Reversal in Alter-Op pass
masahi merged PR #15679: URL: https://github.com/apache/tvm/pull/15679
[GitHub] [tvm] Lunderberg commented on pull request #15839: [Unity] Include last kernel launch in captured CudaGraph
Lunderberg commented on PR #15839: URL: https://github.com/apache/tvm/pull/15839#issuecomment-1739748683 As part of this implementation, moving the `R.memory.kill_tensor` and `R.memory.kill_storage` to be generated in a later `KillAfterLastUse` pass meant that the `RemoveCUDAGraph` pass did not need to track these calls.
[GitHub] [tvm] Lunderberg opened a new pull request, #15839: [Unity] Include last kernel launch in captured CudaGraph
Lunderberg opened a new pull request, #15839: URL: https://github.com/apache/tvm/pull/15839 Prior to this commit, the last kernel launch would not be included in a captured CUDA graph. This commit updates `RewriteCUDAGraph` to include the last kernel launch. The previous implementation assumed that any calls to `R.builtin.alloc_tensor` that remain after `StaticPlanBlockMemory` are dynamic allocations. This is not the case, as the allocation of a static-shaped output tensor may still use `R.builtin.alloc_tensor`. The primary change of this commit was to update `RewriteCUDAGraph` to check for static allocations directly, rather than inferring a static allocation based on the operation being used. This change triggered an additional bug, in which the previous implementation only checked for output variables if they occurred as part of a `VarBinding`, and not if they occurred as the body of a `SeqExpr`. As a result, a captured CUDA graph whose output was immediately used as the output of the containing Relax function would contain an undefined variable. This commit updates `RewriteCUDAGraph` to operate on a `SeqExpr` rather than a `BindingBlock`, so that the `SeqExprNode::body` may be inspected for output variables.
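For readers trying the pass out, a hedged sketch of how the rewrite is switched on. The `relax.backend.use_cuda_graph` flag is the one checked in `relax.build()` (see the `vm_build.py` hunk later in this digest); the snippet assumes a TVM build that includes the Relax VM backend, where this config key is registered.

```python
import tvm

# Building a Relax module inside this context makes relax.build() append the
# RewriteCUDAGraph pass to its lowering pipeline.
with tvm.transform.PassContext(config={"relax.backend.use_cuda_graph": True}):
    ctx = tvm.transform.PassContext.current()
    print(ctx.config.get("relax.backend.use_cuda_graph", False))  # True
```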
[GitHub] [tvm] tlopex commented on issue #15148: [Tracking Issue] TFLite operator support
tlopex commented on issue #15148: URL: https://github.com/apache/tvm/issues/15148#issuecomment-1739729430

@p3achyjr
```
# with respect to its fp32 input range, defined in fake_quant.
# s = 255/(fmax-fmin); m = -fmin*s (the zero point)
for i in input_arrays:
    try:
        quant_scale = 255 / (input_range[i][1] - input_range[i][0])
    except ZeroDivisionError:
        print("Min and max of the input range for tensor " + i + " can't be equal")
    mean = -input_range[i][0] * quant_scale
    input_stats[i] = (mean, quant_scale)
```
Here, if the range is symmetrical, the zero point will always be 128.
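A quick numeric check of the formula quoted above, with example values of my own choosing rather than values from the test: for any symmetric fp32 range the zero point works out to 127.5 and rounds to 128.

```python
# s = 255 / (fmax - fmin); m = -fmin * s  (the zero point)
fmin, fmax = -3.0, 3.0                # symmetric example range
quant_scale = 255 / (fmax - fmin)     # 42.5
zero_point = -fmin * quant_scale      # 127.5, i.e. 128 after rounding
print(quant_scale, zero_point, round(zero_point))
```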
[GitHub] [tvm] srkreddy1238 commented on a diff in pull request #15833: [VM] Memory Manager moved up to runtime
srkreddy1238 commented on code in PR #15833: URL: https://github.com/apache/tvm/pull/15833#discussion_r1340435814 ## include/tvm/runtime/memory/memory_manager.h: ## @@ -37,15 +37,15 @@ namespace tvm { namespace runtime { -namespace vm { +namespace memory { struct Buffer { /*! \brief The pointer to the allocated block of memory. */ void* data{nullptr}; /*! \brief The size of the block. */ size_t size{0}; /*! \brief The shape of the tensor. */ - std::vector shape; + ShapeTuple shape; Review Comment: Agreed. Even the two stage memory allocation for OpenCL texture doesn't need it.
[GitHub] [tvm] yongwww commented on a diff in pull request #15833: [VM] Memory Manager moved up to runtime
yongwww commented on code in PR #15833: URL: https://github.com/apache/tvm/pull/15833#discussion_r1340399667 ## include/tvm/runtime/memory/memory_manager.h: ## @@ -37,15 +37,15 @@ namespace tvm { namespace runtime { -namespace vm { +namespace memory { struct Buffer { /*! \brief The pointer to the allocated block of memory. */ void* data{nullptr}; /*! \brief The size of the block. */ size_t size{0}; /*! \brief The shape of the tensor. */ - std::vector shape; + ShapeTuple shape; Review Comment: we can remove this member `ShapeTuple shape;`, and the related `buf.shape.push_back(shape[i]);` in naive_allocator.h. It looks like it is not used. ## src/runtime/memory/memory_manager.cc: ## @@ -154,29 +154,30 @@ Allocator* MemoryManager::GetAllocator(Device dev) { return it->second.get(); } -NDArray Allocator::Empty(std::vector shape, DLDataType dtype, DLDevice dev) { +NDArray Allocator::Empty(ShapeTuple shape, DLDataType dtype, DLDevice dev, + Optional mem_scope) { VerifyDataType(dtype); NDArray::Container* container = new NDArray::Container(nullptr, shape, dtype, dev); container->SetDeleter(BufferDeleter); - size_t size = GetDataSize(container->dl_tensor); + size_t size = DeviceAPI::Get(dev)->GetDataSize(container->dl_tensor); size_t alignment = GetDataAlignment(container->dl_tensor); Buffer* buffer = new Buffer; - *buffer = this->Alloc(size, alignment, dtype); + if (!mem_scope.defined() || mem_scope == "global") { +*buffer = this->Alloc(size, alignment, dtype); + } else { +*buffer = this->Alloc(shape, dtype, mem_scope.value()); + } container->manager_ctx = reinterpret_cast(buffer); container->dl_tensor.data = buffer->data; return NDArray(GetObjectPtr(container)); } Review Comment: it would be helpful to add the definition of Storage as below in memory_manager.cc, as it was not defined yet. ``` Storage::Storage(Buffer buffer) { auto n = make_object(); n->buffer = std::move(buffer); data_ = std::move(n); } ```
[GitHub] [tvm] Lunderberg merged pull request #15810: [Unity] Implement relax.transform.KillAfterLastUse
Lunderberg merged PR #15810: URL: https://github.com/apache/tvm/pull/15810
[tvm] branch unity updated: [Unity] Implement relax.transform.KillAfterLastUse (#15810)
This is an automated email from the ASF dual-hosted git repository. lunderberg pushed a commit to branch unity in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/unity by this push: new aa4587feb5 [Unity] Implement relax.transform.KillAfterLastUse (#15810) aa4587feb5 is described below commit aa4587feb5103927d95e5e931149debd0a0aeafc Author: Eric Lunderberg AuthorDate: Thu Sep 28 08:54:22 2023 -0500 [Unity] Implement relax.transform.KillAfterLastUse (#15810) * [Unity][Util] Expose CanonicalizeBindings as internal utility * [Unity] Implement relax.transform.KillAfterLastUse Prior to this commit, intermediate objects produced while executing a Relax function would persist until the end of the Relax function. While re-use of static allocations is handled by the `StaticPlanBlockMemory` transform, re-use of dynamic allocations is handled by the `relax_vm::PooledAllocator`. For large Relax functions representing end-to-end model execution, releasing memory from the VM registers to the `relax_vm::PooledAllocator` at the end of the function call may be insufficient. This commit introduces a new pass, `relax.transform.KillAfterLastUse`, which identifies the last usage of each Relax variable and inserts a `relax.memory.kill_tensor`, `relax.memory.kill_storage`, or `relax.vm.kill_object` call depending on the object type. This insertion is suppressed if a Relax variables is already killed, such as static allocations and tensors tracked by `StaticPlanBlockMemory`. * Avoid calling R.vm.kill_object on objects not in registers --- python/tvm/relax/transform/transform.py| 10 + python/tvm/relax/vm_build.py | 1 + src/relax/transform/kill_after_last_use.cc | 289 + src/relax/transform/utils.h| 13 ++ tests/python/relax/test_kill_after_last_use.py | 55 + 5 files changed, 368 insertions(+) diff --git a/python/tvm/relax/transform/transform.py b/python/tvm/relax/transform/transform.py index 13874aa044..2a06d5098e 100644 --- a/python/tvm/relax/transform/transform.py +++ b/python/tvm/relax/transform/transform.py @@ -368,6 +368,16 @@ def StaticPlanBlockMemory() -> tvm.ir.transform.Pass: return _ffi_api.StaticPlanBlockMemory() # type: ignore +def KillAfterLastUse() -> tvm.ir.transform.Pass: +"""Drop all tensor/storage objects after last use + +Returns +--- +ret : tvm.ir.transform.Pass +""" +return _ffi_api.KillAfterLastUse() # type: ignore + + def VMBuiltinLower() -> tvm.ir.transform.Pass: """Lowering generic intrinsic to VM intrinsics. diff --git a/python/tvm/relax/vm_build.py b/python/tvm/relax/vm_build.py index d8679522db..142da5c451 100644 --- a/python/tvm/relax/vm_build.py +++ b/python/tvm/relax/vm_build.py @@ -310,6 +310,7 @@ def build( passes.append(relax.transform.RemovePurityChecking()) passes.append(relax.transform.CallTIRRewrite()) passes.append(relax.transform.StaticPlanBlockMemory()) +passes.append(relax.transform.KillAfterLastUse()) if tvm.transform.PassContext.current().config.get("relax.backend.use_cuda_graph", False): passes.append(relax.transform.RewriteCUDAGraph()) diff --git a/src/relax/transform/kill_after_last_use.cc b/src/relax/transform/kill_after_last_use.cc new file mode 100644 index 00..0f28c6c2b9 --- /dev/null +++ b/src/relax/transform/kill_after_last_use.cc @@ -0,0 +1,289 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. 
The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +/*! + * \file src/relax/transform/kill_after_last_use.cc + * \brief Kill storage/tensor objects after last use, if not already killed + */ +#include +#include +#include +#include +#include +#include + +#include +#include +#include + +#include "utils.h" + +namespace tvm { +namespace relax { + +class UnusedTrivialBindingRemover : public ExprMutator { + public: + static Expr Apply(Expr expr) { +struct UsedCollector : ExprVisitor { + void VisitExpr_(const VarNode* val) override { used.insert(val); } + void VisitExpr_(const DataflowVarNode* val) override { +
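For orientation, a hedged sketch of where the new pass sits relative to its neighbours, mirroring the `vm_build.py` hunk in the commit above; it only constructs the pass sequence and does not claim to reproduce the full `relax.build()` pipeline.

```python
import tvm
from tvm import relax

# Memory is planned statically first; KillAfterLastUse then inserts kill calls
# for whatever the static planner did not already free.
lowering = tvm.transform.Sequential(
    [
        relax.transform.RemovePurityChecking(),
        relax.transform.CallTIRRewrite(),
        relax.transform.StaticPlanBlockMemory(),
        relax.transform.KillAfterLastUse(),
    ]
)
# lowered_mod = lowering(relax_module)  # applied to a Relax IRModule during build
```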
[GitHub] [tvm] Lunderberg commented on issue #15148: [Tracking Issue] TFLite operator support
Lunderberg commented on issue #15148: URL: https://github.com/apache/tvm/issues/15148#issuecomment-1739257241

> I see what you're saying--maybe we can add min/max overrides for `_test_forward_elemwise_quantized`.

That's what I'm thinking as well. It looks like it currently uses the same range for both quantization and for data generation. I think it will need to override the data generation range to exclude zero from the denominator, but to keep zero in the quantization range, as zero may occur in the output.

> I'm surprised that `div` and `floor_div` aren't failing in this case though, since the rhs can generate 0s :/.

Agreed, as I would expect the same problem to affect any operator with a restricted domain. My guess is that there's some optimization that assumes the inputs to be valid (a legal assumption, as the output is typically undefined when the denominator is zero), and that that optimization is affecting floormod differently from floordiv. It probably would be good to track that optimization down at some point, if it occurs at the TVM level, but I don't think that should delay the re-enabling of the unit test.

> May I ask how you're running these tests multiple times?

It's a bit of a hacky way to do so. I commented out everything in `test_all_elemwise` except for the `_test_forward_elemwise_quantized(_test_floor_mod)` line, then added a parametrized pytest fixture to the file. When running pytest as usual (`python3 -mpytest -sv tests/python/frontend/tflite/test_forward.py::test_all_elemwise`), it then repeats every test the number of times specified.

```python
import pytest

@pytest.fixture(params=list(range(100)), autouse=True)
def repeat_all_tests(request):
    return request.param
```

I suppose I could have just made a for loop, but I was lazy and this let me use pytest's pass/fail counter instead of making my own :P.
[GitHub] [tvm] srkreddy1238 commented on pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
srkreddy1238 commented on PR #15818: URL: https://github.com/apache/tvm/pull/15818#issuecomment-1739147760 Never mind. While I was relooking into the code I found a reference to the deleted object and fixed it.
[GitHub] [tvm] echuraev commented on a diff in pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
echuraev commented on code in PR #15818: URL: https://github.com/apache/tvm/pull/15818#discussion_r1339925369 ## apps/cpp_rtvm/main.cc: ## @@ -350,43 +350,46 @@ int ExecuteModel(ToolArgs& args) { } else if (!args.input.empty() && !args.output.empty()) { LOG(INFO) << "Executing with Input:" << args.input << " Output:" << args.output; // Set Input from Numpy Input -runner->SetInput(args.input); +runner.SetInput(args.input); // Run the model -runner->Run(); +runner.Run(); // Get Output as Numpy dump -runner->GetOutput(args.output); +runner.GetOutput(args.output); } else { LOG(INFO) << "Executing dry run ... "; // Set random input for all inputs for (auto& elem : mInfo.input_info) { LOG(INFO) << "Set Random Input for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetInputMemSize(elem.first); + size_t ssize = runner.GetInputMemSize(elem.first); char* data = (char*)malloc(ssize); LOG(INFO) << "Random Input Size:" << ssize << " bytes"; - runner->SetInput(elem.first, data); + runner.SetInput(elem.first, data); free(data); } // Run the model -runner->Run(); +runner.Run(); // Get Output and dump few values for (auto& elem : mInfo.output_info) { LOG(INFO) << "Get Output for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetOutputMemSize(elem.first); + size_t ssize = runner.GetOutputMemSize(elem.first); char* data = (char*)malloc(ssize); - runner->GetOutput(elem.first, data); + runner.GetOutput(elem.first, data); LOG(INFO) << "Output Size:" << ssize << " bytes"; free(data); } } + + if (args.profile) { +// Print Stats +runner.PrintStats(); + } auto tstart = std::chrono::high_resolution_clock::now(); - delete runner; + runner.~TVMRunner(); Review Comment: Ah.. My bad, I missed it during reviewing... Sorry. Probably in this case dynamic allocation should be more preferable in terms of readability and syntax. So we can leave it as it is or change back to the dynamic allocation.
[GitHub] [tvm] srkreddy1238 commented on pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
srkreddy1238 commented on PR #15818: URL: https://github.com/apache/tvm/pull/15818#issuecomment-1738870461 @echuraev There are some requirements and concerns on high load and unload timings with TVM in general. With the pre-compiled bin we could reduce the majority of it, and there is some more scope in params loading. Unload time is also a bit significant. I am working on a few improvements around these areas ...
[GitHub] [tvm] srkreddy1238 commented on a diff in pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
srkreddy1238 commented on code in PR #15818: URL: https://github.com/apache/tvm/pull/15818#discussion_r1339876548 ## apps/cpp_rtvm/main.cc: ## @@ -350,43 +350,46 @@ int ExecuteModel(ToolArgs& args) { } else if (!args.input.empty() && !args.output.empty()) { LOG(INFO) << "Executing with Input:" << args.input << " Output:" << args.output; // Set Input from Numpy Input -runner->SetInput(args.input); +runner.SetInput(args.input); // Run the model -runner->Run(); +runner.Run(); // Get Output as Numpy dump -runner->GetOutput(args.output); +runner.GetOutput(args.output); } else { LOG(INFO) << "Executing dry run ... "; // Set random input for all inputs for (auto& elem : mInfo.input_info) { LOG(INFO) << "Set Random Input for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetInputMemSize(elem.first); + size_t ssize = runner.GetInputMemSize(elem.first); char* data = (char*)malloc(ssize); LOG(INFO) << "Random Input Size:" << ssize << " bytes"; - runner->SetInput(elem.first, data); + runner.SetInput(elem.first, data); free(data); } // Run the model -runner->Run(); +runner.Run(); // Get Output and dump few values for (auto& elem : mInfo.output_info) { LOG(INFO) << "Get Output for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetOutputMemSize(elem.first); + size_t ssize = runner.GetOutputMemSize(elem.first); char* data = (char*)malloc(ssize); - runner->GetOutput(elem.first, data); + runner.GetOutput(elem.first, data); LOG(INFO) << "Output Size:" << ssize << " bytes"; free(data); } } + + if (args.profile) { +// Print Stats +runner.PrintStats(); + } auto tstart = std::chrono::high_resolution_clock::now(); - delete runner; + runner.~TVMRunner(); Review Comment: We are timing it to measure the unload time.
[GitHub] [tvm] echuraev merged pull request #15830: [ADRENO] Minor changes for Adreno docs and help scripts
echuraev merged PR #15830: URL: https://github.com/apache/tvm/pull/15830
[tvm] branch main updated: [ADRENO] Minor changes for Adreno docs and help scripts (#15830)
This is an automated email from the ASF dual-hosted git repository. echuraev pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/tvm.git The following commit(s) were added to refs/heads/main by this push: new 9d8e6fda50 [ADRENO] Minor changes for Adreno docs and help scripts (#15830) 9d8e6fda50 is described below commit 9d8e6fda50bce14bf597de1f87711230e6001e4e Author: Siva AuthorDate: Thu Sep 28 15:38:30 2023 +0530 [ADRENO] Minor changes for Adreno docs and help scripts (#15830) [ADRENO] Minor changes for Adreno docs and help scripts NCHW is mandatory layout for CLML offload. Updated the docs. CI scripts will keep OpenCL enbaled as fallback always. Enable configurable device bind ports. Helps in multi user environments. --- docs/how_to/deploy/adreno.rst | 2 +- tests/scripts/setup-adreno-env.sh | 18 +- tests/scripts/task_config_build_adreno.sh | 2 ++ 3 files changed, 16 insertions(+), 6 deletions(-) diff --git a/docs/how_to/deploy/adreno.rst b/docs/how_to/deploy/adreno.rst index ed016a3ff7..f0b8c6f757 100644 --- a/docs/how_to/deploy/adreno.rst +++ b/docs/how_to/deploy/adreno.rst @@ -432,7 +432,7 @@ as the OpenCL path is fallback option for any operator didn't go through OpenCLM python3 -m tvm.driver.tvmc compile \ --cross-compiler ${ANDROID_NDK_HOME}/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android28-clang \ - --target="opencl, clml, llvm" --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \ + --target="opencl, clml, llvm" --desired-layout NCHW --target-llvm-mtriple aarch64-linux-gnu --target-opencl-device adreno \ --tuning-records keras-resnet50.log -o keras-resnet50.tar resnet50.h5 On successful compilation, above command produce ``keras-resnet50.tar``. diff --git a/tests/scripts/setup-adreno-env.sh b/tests/scripts/setup-adreno-env.sh index 55a92c5f61..15c124a0f0 100755 --- a/tests/scripts/setup-adreno-env.sh +++ b/tests/scripts/setup-adreno-env.sh @@ -20,12 +20,13 @@ ENVIRONMENT="" RPC_PORT="" ADB_SERIAL="" +LISTEN_PORT=5000 function usage() { echo "Helper script to setup the environment for Tracker, RPC Device and for application" echo "Usage (Help) : source setup-adreno-env.sh -h" echo "Usage (Tracker): source setup-adreno-env.sh -e tracker -p " -echo "Usage (Device): source setup-adreno-env.sh -e device -p -d " +echo "Usage (Device): source setup-adreno-env.sh -e device -p -d [-l ]" echo "Usage (Query): source setup-adreno-env.sh -e query -p " } @@ -46,6 +47,11 @@ while [[ $# -gt 0 ]]; do shift # past argument shift # past value ;; +-l|--listen-port) + LISTEN_PORT="$2" + shift # past argument + shift # past value + ;; -h|--help) usage return 0 @@ -62,6 +68,7 @@ done echo "ENVIRONMENT = ${ENVIRONMENT}" echo "RPC_PORT = ${RPC_PORT}" echo "ADB_SERIAL= ${ADB_SERIAL}" +echo "DEVICE LISTEN POPRT= ${LISTEN_PORT}" function def_environment() { @@ -100,10 +107,11 @@ case ${ENVIRONMENT} in fi adb reverse tcp:${TVM_TRACKER_PORT} tcp:${TVM_TRACKER_PORT} -adb forward tcp:5000 tcp:5000 -adb forward tcp:5001 tcp:5001 -adb forward tcp:5002 tcp:5002 -adb shell "cd ${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=5000 --port-end=5010 --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" +adb forward tcp:${LISTEN_PORT} tcp:${LISTEN_PORT} +adb forward tcp:$((LISTEN_PORT + 1)) tcp:$((LISTEN_PORT + 1)) +adb forward tcp:$((LISTEN_PORT + 2)) tcp:$((LISTEN_PORT + 2)) +adb forward tcp:$((LISTEN_PORT + 3)) tcp:$((LISTEN_PORT + 3)) +adb shell "cd 
${TARGET_FOLDER}; killall -9 tvm_rpc-${USER}; sleep 2; LD_LIBRARY_PATH=${TARGET_FOLDER}/ ./tvm_rpc-${USER} server --host=0.0.0.0 --port=${LISTEN_PORT} --port-end=$((LISTEN_PORT + 10)) --tracker=127.0.0.1:${TVM_TRACKER_PORT} --key=${RPC_DEVICE_KEY}" ;; "query") diff --git a/tests/scripts/task_config_build_adreno.sh b/tests/scripts/task_config_build_adreno.sh index 62e6ffecbc..1b6750f165 100755 --- a/tests/scripts/task_config_build_adreno.sh +++ b/tests/scripts/task_config_build_adreno.sh @@ -25,6 +25,8 @@ cp ../cmake/config.cmake . if [ -f "${ADRENO_OPENCL}/CL/cl_qcom_ml_ops.h" ] ; then echo set\(USE_CLML ${ADRENO_OPENCL}\) >> config.cmake +else +echo set\(USE_OPENCL ON\) >> config.cmake fi echo set\(USE_RPC ON\) >> config.cmake echo set\(USE_GRAPH_EXECUTOR ON\) >> config.cmake
[GitHub] [tvm] echuraev commented on a diff in pull request #15818: [CLI TOOLS][RTVM] Improve rtvm tool with new options to measure native performance
echuraev commented on code in PR #15818: URL: https://github.com/apache/tvm/pull/15818#discussion_r1339863992 ## apps/cpp_rtvm/main.cc: ## @@ -350,43 +350,46 @@ int ExecuteModel(ToolArgs& args) { } else if (!args.input.empty() && !args.output.empty()) { LOG(INFO) << "Executing with Input:" << args.input << " Output:" << args.output; // Set Input from Numpy Input -runner->SetInput(args.input); +runner.SetInput(args.input); // Run the model -runner->Run(); +runner.Run(); // Get Output as Numpy dump -runner->GetOutput(args.output); +runner.GetOutput(args.output); } else { LOG(INFO) << "Executing dry run ... "; // Set random input for all inputs for (auto& elem : mInfo.input_info) { LOG(INFO) << "Set Random Input for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetInputMemSize(elem.first); + size_t ssize = runner.GetInputMemSize(elem.first); char* data = (char*)malloc(ssize); LOG(INFO) << "Random Input Size:" << ssize << " bytes"; - runner->SetInput(elem.first, data); + runner.SetInput(elem.first, data); free(data); } // Run the model -runner->Run(); +runner.Run(); // Get Output and dump few values for (auto& elem : mInfo.output_info) { LOG(INFO) << "Get Output for :" << elem.first; auto shape = elem.second.first; - size_t ssize = runner->GetOutputMemSize(elem.first); + size_t ssize = runner.GetOutputMemSize(elem.first); char* data = (char*)malloc(ssize); - runner->GetOutput(elem.first, data); + runner.GetOutput(elem.first, data); LOG(INFO) << "Output Size:" << ssize << " bytes"; free(data); } } + + if (args.profile) { +// Print Stats +runner.PrintStats(); + } auto tstart = std::chrono::high_resolution_clock::now(); - delete runner; + runner.~TVMRunner(); Review Comment: Why do you need to call the dtor here? Probably, I missed something in the logic. I suppose that the object will be automatically destroyed at the end of the program.
[GitHub] [tvm] srkreddy1238 opened a new pull request, #15838: [FRONTEND] Fix unnecessary pylint errors
srkreddy1238 opened a new pull request, #15838: URL: https://github.com/apache/tvm/pull/15838 Handle unnecessary pylint errors from these frontends
[GitHub] [tvm] srkreddy1238 commented on a diff in pull request #15833: [VM] Memory Manager moved up to runtime
srkreddy1238 commented on code in PR #15833: URL: https://github.com/apache/tvm/pull/15833#discussion_r1339827581 ## include/tvm/runtime/memory_manager.h: ## @@ -130,7 +131,7 @@ class MemoryManager { class StorageObj : public Object { public: /*! \brief The index into the VM function table. */ - Buffer buffer; + MBuffer buffer; /*! \brief Allocate an NDArray from a given piece of storage. */ NDArray AllocNDArray(size_t offset, std::vector shape, DLDataType dtype); Review Comment: Handled ShapeTuple across memory_manager and Allocators
[GitHub] [tvm] masahi opened a new pull request, #15837: [Unity][BYOC] Support attention over batched sequences of variable sequence lengths
masahi opened a new pull request, #15837: URL: https://github.com/apache/tvm/pull/15837 (no comment)
[GitHub] [tvm] lhutton1 opened a new pull request, #15836: [CI] Update ci-gpu image
lhutton1 opened a new pull request, #15836: URL: https://github.com/apache/tvm/pull/15836 The ci-gpu image is updated so that the changes in #15819 are reflected in CI. cc @ashutosh-arm