tqchen commented on code in PR #14513:
URL: https://github.com/apache/tvm/pull/14513#discussion_r1161740959
##########
python/tvm/relax/vm_build.py:
##########
@@ -297,6 +297,10 @@ def foo(x: Tensor((3, 4), "float32"), y: Tensor((3, 4), "float32")):
     passes.append(relax.transform.ToNonDataflow())
     passes.append(relax.transform.CallTIRRewrite())
     passes.append(relax.transform.StaticPlanBlockMemory())
+
+    if tvm.transform.PassContext.current().config.get("relax.backend.use_cuda_graph", False):
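
For reference, a caller would presumably opt into this branch through a `PassContext` config. A minimal sketch, assuming the standard `relax.build` entry point, an already-constructed module `mod`, and the `relax.backend.use_cuda_graph` key this diff reads:

```python
import tvm
from tvm import relax

# Sketch: `mod` is an assumed relax IRModule to compile.
# Enabling the config key makes the gated passes above run during build.
with tvm.transform.PassContext(config={"relax.backend.use_cuda_graph": True}):
    ex = relax.build(mod, target="cuda")
```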
Review Comment:
Perhaps one approach is to lift the static memory planning and TIR rewrite phases out into a separate function. The CUDA graph rewrite can then invoke these passes from the outside and pass the result to build, which detects that the function has already gone through the runtime optimization stage and skips those passes.
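
A minimal sketch of that refactor, assuming a hypothetical helper name (`lower_runtime_builtins`) and a hypothetical module attribute (`"relax.runtime_lowered"`) as the marker build would check:

```python
import tvm
from tvm import relax


def lower_runtime_builtins(mod: tvm.IRModule) -> tvm.IRModule:
    """Hypothetical lifted helper: run the runtime-lowering passes once
    and tag the module so a later build() can detect that fact."""
    seq = tvm.transform.Sequential(
        [
            relax.transform.ToNonDataflow(),
            relax.transform.CallTIRRewrite(),
            relax.transform.StaticPlanBlockMemory(),
        ]
    )
    # "relax.runtime_lowered" is an assumed attribute key, not an
    # existing TVM convention.
    return seq(mod).with_attr("relax.runtime_lowered", True)


def build(mod: tvm.IRModule, target):
    # Skip the lowering stage when a caller (e.g. the CUDA graph
    # rewrite) has already run it through the helper above.
    if not mod.get_attr("relax.runtime_lowered"):
        mod = lower_runtime_builtins(mod)
    # ... continue with the remaining build steps for `target`
```

Under this scheme the CUDA graph path would call lower_runtime_builtins itself, apply its rewrite, and hand the tagged module to build, which sees the attribute and skips the duplicate passes.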