zhiics commented on a change in pull request #6655:
URL: https://github.com/apache/tvm/pull/6655#discussion_r530535123



##########
File path: src/relay/transforms/annotate_target.cc
##########
@@ -128,14 +143,31 @@ class AnnotateTargetRewriter : public ExprRewriter {
      * \return An annotated and target-propagated relay expression.
      */
     Expr new_expr = expr;
-    if (op_expr_to_target_.find(expr) != op_expr_to_target_.end() && FreeVars(expr).size() != 0) {
-      new_expr = InsertAnnotation(expr, op_expr_to_target_[expr], make_end_op);
-      op_expr_to_target_[new_expr] = op_expr_to_target_[expr];
+    const CallNode* call = expr.as<CallNode>();
+    if (op_expr_to_target_.find(expr) != op_expr_to_target_.end()) {
+      // Check whether expr has args, if not - do not insert compiler_end.
+      if (expr->IsInstance<RefWriteNode>() || expr->IsInstance<RefCreateNode>() ||
+          expr->IsInstance<RefReadNode>() || expr->IsInstance<TupleNode>() ||

Review comment:
       There would be more nodes to handle, such as constructors. But I still
doubt that this change is needed; it makes an already complicated pass even
more complicated. I still don't see a good reason not to run
MergeCompilerRegions, which would solve this problem. Without running it, we
would end up with a large number of small segments, which requires frequent
data transfer between the host and device as well as frequent kernel launches.
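
To make the segment-count concern concrete, here is a rough toy sketch. It is not TVM code and not how MergeCompilerRegions is implemented: it assumes a purely linear chain of ops, and `OpChain`, `CountRegionsUnmerged`, and `CountRegionsMerged` are hypothetical names invented for this illustration. Each op carries the tag of the target that supports it, with "default" standing for the host.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy model: a linear chain of operators, each tagged with the
// target that supports it. "default" means the op stays on the host.
using OpChain = std::vector<std::string>;

// Without merging, every offloaded op ends up in its own region.
std::size_t CountRegionsUnmerged(const OpChain& chain) {
  std::size_t regions = 0;
  for (const std::string& target : chain) {
    if (target != "default") ++regions;
  }
  return regions;
}

// With merging, a run of consecutive ops on the same non-host target
// collapses into a single region.
std::size_t CountRegionsMerged(const OpChain& chain) {
  std::size_t regions = 0;
  std::string previous = "default";
  for (const std::string& target : chain) {
    // A new region starts only when we enter a non-host target that
    // differs from the target of the previous op.
    if (target != "default" && target != previous) ++regions;
    previous = target;
  }
  // Sanity check: there can never be more regions than ops.
  assert(regions <= chain.size());
  return regions;
}
```

For a chain tagged `{"dnnl", "dnnl", "default", "dnnl", "dnnl", "dnnl"}` (the tag is only an example), the unmerged count is 5 and the merged count is 2. In this simplified model each region has one entry and one exit, so fewer regions means fewer host/device boundary crossings and fewer launches.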





