YongtaoHuang1994 opened a new issue #8286:
URL: https://github.com/apache/tvm/issues/8286
We have set up an AMD GPU and its SDK, ROCm.
```
$ lspci
02:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Baffin [Radeon RX 550 640SP / RX 560/560X] (rev ff)
$ /opt/rocm/bin/rocminfo | grep Vendor
Vendor Name: CPU
Vendor Name: AMD
```
Then we enabled "set(USE_LLVM llvm-config-9)" and "set(USE_ROCM on)" in config.cmake and
built TVM with ROCm successfully. Next, we set the environment variables:
```
$ export TVM_HOME=/home/hyongtao/tvm_src/tvm
$ export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
```
With all the dependencies in place, we tested ResNet in TVM with ROCm. The
inference code is as follows:
```
import os

import numpy as np
import tvm
from tvm import relay, autotvm
import tvm.relay.testing
from tvm.autotvm.tuner import XGBTuner, GATuner, RandomTuner, GridSearchTuner
import tvm.contrib.graph_executor as runtime


def get_network(name, batch_size):
    """Get the symbol definition and random weight of a network"""
    input_shape = (batch_size, 3, 224, 224)
    output_shape = (batch_size, 1000)
    if "resnet" in name:
        n_layer = int(name.split("-")[1])
        mod, params = relay.testing.resnet.get_workload(
            num_layers=n_layer, batch_size=batch_size, dtype=dtype
        )
    else:
        raise ValueError("Unsupported network: " + name)
    return mod, params, input_shape, output_shape


target = tvm.target.rocm()
network = "resnet-18"
log_file = "%s.log" % network
dtype = "float32"


def tune_and_evaluate():
    # extract workloads from relay program
    print("Extract tasks...")
    mod, params, input_shape, out_shape = get_network(network, batch_size=1)
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build_module.build(mod, target=target, params=params)
    # load parameters
    dev = tvm.device(str(target), 0)
    module = runtime.GraphModule(lib["default"](dev))
    data_tvm = tvm.nd.array((np.random.uniform(size=input_shape)).astype(dtype))
    module.set_input("data", data_tvm)
    # evaluate
    print("Evaluate inference time cost...")
    ftimer = module.module.time_evaluator("run", dev, number=1, repeat=600)
    prof_res = np.array(ftimer().results) * 1000  # convert to millisecond
    print(
        "Mean inference time (std dev): %.2f ms (%.2f ms)"
        % (np.mean(prof_res), np.std(prof_res))
    )


tune_and_evaluate()
```
Then we get these errors:
```
[16:00:20] /home/hyongtao/tvm_src/tvm/src/target/target_kind.cc:182: Warning: Unable to detect ROCm compute arch, default to "-mcpu=gfx900" instead
[16:00:20] /home/hyongtao/tvm_src/tvm/src/target/target_kind.cc:196: Warning: Unable to detect ROCm version, assuming >= 3.5
[16:00:20] /home/hyongtao/tvm_src/tvm/src/target/target_kind.cc:196: Warning: Unable to detect ROCm version, assuming >= 3.5
One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.
[16:00:27] /home/hyongtao/tvm_src/tvm/src/target/target_kind.cc:196: Warning: Unable to detect ROCm version, assuming >= 3.5
[16:00:27] /home/hyongtao/tvm_src/tvm/src/target/llvm/codegen_amdgpu.cc:54: Warning: Cannot get maximum number of threads for AMD codegen
[16:00:27] /home/hyongtao/tvm_src/tvm/src/target/llvm/codegen_amdgpu.cc:54: Warning: Cannot get maximum number of threads for AMD codegen
LLVM ERROR: Unknown specifier in datalayout string
(tvm-build) root@hyongtao-Precision-Tower-5810:/home/hyongtao/tvm_demo#
```
Could you help me solve this problem? Thanks a lot.
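Regarding the first warning in the log (the compute arch could not be detected, so "-mcpu=gfx900" was assumed): a minimal sketch of how the arch name could be read from `rocminfo` output and turned into an explicit target string. The helper names `find_gfx_archs` and `rocm_target_string` and the sample text are hypothetical, made up for illustration; the sample is not the real output from this machine. This only concerns the arch warning and is not claimed to resolve the `LLVM ERROR`.

```python
import re


def find_gfx_archs(rocminfo_text):
    """Return the distinct gfx arch names found in rocminfo output, in order."""
    seen = []
    for name in re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_text):
        if name not in seen:
            seen.append(name)
    return seen


def rocm_target_string(arch):
    """Build a TVM target string that sets the arch explicitly (hypothetical helper)."""
    return "rocm -mcpu=" + arch


# Hypothetical rocminfo fragment, for illustration only.
sample = """
  Vendor Name:             AMD
  Name:                    gfx803
      Name:                    amdgcn-amd-amdhsa--gfx803
"""

archs = find_gfx_archs(sample)
print(archs)
if archs:
    print(rocm_target_string(archs[0]))
```

On a real machine the text would come from running `/opt/rocm/bin/rocminfo`, and the resulting string could presumably be passed to `tvm.target.Target(...)` in place of `tvm.target.rocm()`; we have not verified that on this setup.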