jwfromm commented on a change in pull request #7063:
URL: https://github.com/apache/tvm/pull/7063#discussion_r539509778
##########
File path: python/tvm/contrib/nvcc.py
##########
@@ -269,15 +270,24 @@ def have_int8(compute_version):
return False
-def have_tensorcore(compute_version):
+def have_tensorcore(compute_version=None):
"""Either TensorCore support is provided in the compute capability or not
Parameters
----------
compute_version : str
compute capability of a GPU (e.g. "7.0")
"""
+ if compute_version is None:
+ if tvm.gpu(0).exist:
+ compute_version = tvm.gpu(0).compute_version
+ else:
+ compute_version = AutotvmGlobalScope.current.cuda_target_arch
Review comment:
I thought about this some more overnight, and I'm not sure adding it to
PassContext makes any more sense than having it in AutotvmGlobalScope. We
really should be specifying this information as part of a `tvm.Target`, since
the CUDA architecture purely describes the hardware target and doesn't
directly relate to relay passes or autotvm. I think this should be done once
we move further from string-based targets to objects like those introduced in
#6218. For now, I'd argue that applying a bandaid fix to AutotvmGlobalScope,
as in the current PR, is the best way to temporarily solve the problem. What
do you think @comaniac? I'd also be interested in hearing what @tqchen thinks.
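For reference, the fallback chain in the diff (explicit argument, then a
detected device, then a globally configured arch) can be sketched without
depending on tvm itself. This is a minimal, self-contained illustration; the
function names and the `>= 7.0` TensorCore cutoff are assumptions for the
sketch, not the actual tvm API:

```python
def resolve_compute_version(explicit=None, detected=None, configured=None):
    """Return the first available compute capability string, e.g. "7.0".

    Hypothetical stand-in for the diff's logic: prefer an explicit
    argument, then a detected GPU, then a globally configured arch.
    """
    for candidate in (explicit, detected, configured):
        if candidate is not None:
            return candidate
    raise ValueError("no compute capability available")


def have_tensorcore_sketch(compute_version):
    """TensorCores first appeared on compute capability 7.0 (Volta)."""
    major, _, _ = compute_version.partition(".")
    return int(major) >= 7
```

Usage: `have_tensorcore_sketch(resolve_compute_version(None, "7.5", None))`
returns True, while a Pascal-era "6.1" returns False. The design question in
this thread is only *where* the `configured` value should live, not the
lookup order itself.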
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]