masahi commented on code in PR #11911:
URL: https://github.com/apache/tvm/pull/11911#discussion_r919645242


##########
python/tvm/contrib/torch/optimize_torch.py:
##########
@@ -0,0 +1,143 @@
+#!/usr/bin/env python
+# pylint: disable=inconsistent-return-statements
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+# pylint: disable=missing-module-docstring
+# pylint: disable=missing-class-docstring
+# pylint: disable=missing-function-docstring
+"""
+optimize_torch: a function similar to `torch.jit.trace`,
+which optimizes a `torch.nn.Module` via TVM MetaSchedule
+and returns a custom TorchScript operator
+"""
+import base64
+import contextlib
+import tempfile
+from typing import Tuple
+
+import torch
+import torch.utils.dlpack
+
+import tvm
+from tvm import relay
+from tvm._ffi import get_global_func, register_func
+from tvm.meta_schedule import TuneConfig
+from tvm.meta_schedule.tune import tune_relay
+
+
+# The Python wrapper for GraphExecutorFactory
+class GraphExecutorFactoryWrapper(torch.nn.Module):
+    def __init__(self, module: tvm.runtime.Module):
+        super().__init__()
+        self.inner_module = module
+
+    def forward(self, *torch_inputs: Tuple[torch.Tensor]):
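+        # The underlying TVM module always returns a tuple; unwrap single
+        # outputs so the wrapper behaves like the original nn.Module.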
+        ret = self.inner_module.forward(torch_inputs)
+        if len(ret) == 1:
+            return ret[0]
+        return ret
+
+
+def llvm_target():
+    return "llvm -num-cores"
+
+
+@register_func("script_torch.save_to_base64")
+def save_to_base64(obj) -> bytes:
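+    # Export the compiled module to a temporary shared library, then read it
+    # back and base64-encode the bytes so the artifact can be serialized.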
+    with tempfile.NamedTemporaryFile(suffix=".so") as tmpfile:
+        obj.export_library(tmpfile.name)
+        with open(tmpfile.name, "rb") as tfile:
+            return base64.b64encode(tfile.read())
+
+
+def optimize_torch(
+    func,
+    example_inputs,
+    tuning_config=None,
+    target=None,
+    work_dir=None,
+):
+    """Load PyTorch model that could be traced by TorchScript, then optimize 
it via MetaSchedule.
+
+    Parameters
+    ----------
+    func : callable or torch.nn.Module
+        A Python function or nn.Module that can be traced by TorchScript,
+        i.e. `torch.jit.trace(model, input)`.
+
+    example_inputs : tuple or torch.Tensor
+        A tuple of example inputs that will be passed to `func` during
+        tracing to provide the shape information.
+
+    tuning_config : tvm.meta_schedule.TuneConfig
+        The configuration of tuning by MetaSchedule.
+        We suggest users provide their own setting;
+        with the default setting, tuning can be very slow,
+        sometimes taking a few hours.
+
+    target : Optional[Union[str, Target]]
+        The target of the compilation.
+        If the user does not set the target, the module is built for LLVM.
+
+    work_dir : Optional[str]
+        The working directory to save intermediate results.
+
+    Returns
+    -------
+    mod : GraphExecutorFactoryWrapper
+        An object of GraphExecutorFactoryWrapper,
+        which is a subclass of torch.nn.Module.
+    """
+
+    if target is None:
+        target = llvm_target()
+
+    if tuning_config is None:
+        # Default setting. For a better tuning result, the numbers could be set larger.
+        tuning_config = TuneConfig(
+            strategy="evolutionary",
+            num_trials_per_iter=64,
+            max_trials_per_task=2000,
+            max_trials_global=2000,
+        )

Review Comment:
   What do you mean by "common practice"? We are talking about the user experience of PyTorch users who are new to TVM, rather than that of a TVM developer who regularly tunes on a large cloud instance like Xiyou.
   
   These parameters depend on (1) the type of workload (a single op vs a big e2e model, the size of the tuning space for each op, etc.) and (2) CPU vs GPU target. Since different use cases require very different tuning times, I see no justification for 64/2k/2k being good default params for all cases.
   
   If a PyTorch user gives us a single matmul op and asks us to tune it on CPU, this would very likely lead to a very long, wasteful tuning time. And since `optimize_torch` needs to return the tuned model, a user cannot abort the tuning in the middle, even if they find that the tuning is not making any progress. On the other hand, if a user asks us to tune a huge e2e model like MaskRCNN on GPU, it's possible that these params are not sufficient for optimal performance.
   
   I get that the selection of default params is a highly tricky problem. My opinion is: if there is no good strategy to come up with reasonable params based on user inputs, it's better to ask users to give a desired tuning config, rather than us selecting ad hoc params behind the scenes and users ending up complaining about slow tuning time or suboptimal performance.
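   
   For illustration, asking the user for an explicit config could look like the sketch below (a minimal sketch assuming `optimize_torch` is importable from `tvm.contrib.torch` as added in this PR; the model, shapes, and trial counts are hypothetical, not recommended defaults):
   
   ```python
   import torch
   
   from tvm.contrib.torch import optimize_torch
   from tvm.meta_schedule import TuneConfig
   
   # A toy single-op workload; the shapes are made up for illustration.
   model = torch.nn.Linear(128, 128).eval()
   example_input = torch.rand(8, 128)
   
   # The user picks a budget that fits their workload, e.g. a small one for a
   # single op on CPU, instead of relying on an ad hoc 64/2k/2k default.
   config = TuneConfig(
       strategy="evolutionary",
       num_trials_per_iter=32,
       max_trials_per_task=32,
       max_trials_global=32,
   )
   
   tuned = optimize_torch(model, example_input, tuning_config=config)
   ```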


