jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744108401

##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
 set -o pipefail
 # Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list

Review comment:
With this change the TVM build is fine, but I am seeing the following issue when running an example, because we have LLVM 12 in TVM while the ROCm 4.3 bitcode was produced by LLVM 13:

```
E   TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc
E   line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
```

I went back and tried a number of versions of ROCm and LLVM (upstream LLVM, not the one included in ROCm). This is what I got when running an example with each combination:

```
ROCM 4.3
  + lld-9  + llvm-config-9  -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-12 + llvm-config-12 -> TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc
                               line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
ROCM 4.2
  + lld-9  + llvm-config-9  -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
  + lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
ROCM 4.1
  + lld-9  + llvm-config-9  -> Works
  + lld-10 + llvm-config-10 -> Works
  + lld-11 + llvm-config-11 -> Works
  + lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
ROCM 4.0
  + lld-9  + llvm-config-9  -> Works
  + lld-10 + llvm-config-10 -> Works
  + lld-11 + llvm-config-11 -> Works
  + lld-12 + llvm-config-12 -> Works
```

It looks like the last version of ROCm that works with all four LLVM versions tested (9 through 12) is v4.0.
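
In case it helps to guard the Docker install script against a known-bad pairing, the results above could be encoded as a small lookup. This is only a sketch: the function name `rocm_llvm_status` and the short status strings are made up for illustration, and the table covers just the combinations I actually ran (anything else is reported as untested).

```shell
# Hypothetical helper, not part of TVM: looks up the outcome observed in the
# test matrix for a given ROCm version and upstream LLVM major version.
# Usage: rocm_llvm_status <rocm-version> <llvm-major>
rocm_llvm_status() {
  case "$1:$2" in
    # ROCm 4.0 worked with every LLVM version tested.
    4.0:9|4.0:10|4.0:11|4.0:12) echo "works" ;;
    # ROCm 4.1 worked up to LLVM 11.
    4.1:9|4.1:10|4.1:11)        echo "works" ;;
    # hipModuleLoadData failed at runtime.
    4.1:12|4.2:12)              echo "fails: hipErrorSharedObjectInitFailed" ;;
    # "LLVM ERROR: Unknown specifier in datalayout string"
    4.2:9|4.2:10|4.2:11)        echo "fails: unknown datalayout specifier" ;;
    4.3:9|4.3:10|4.3:11)        echo "fails: unknown datalayout specifier" ;;
    # hc.bc was produced by LLVM 13 but read by LLVM 12.
    4.3:12)                     echo "fails: bitcode producer newer than reader" ;;
    # Combination not covered by the matrix.
    *)                          echo "untested" ;;
  esac
}
```

The LLVM major version could be taken from the installed toolchain, e.g. `rocm_llvm_status 4.3 "$(llvm-config --version | cut -d. -f1)"`, assuming `llvm-config` on the `PATH` is the one TVM is built against.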
