jtuyls commented on a change in pull request #9425:
URL: https://github.com/apache/tvm/pull/9425#discussion_r744108401



##########
File path: docker/install/ubuntu_install_rocm.sh
##########
@@ -21,10 +21,10 @@ set -u
 set -o pipefail
 
 # Install ROCm cross compilation toolchain.
-wget -qO - http://repo.radeon.com/rocm/apt/debian/rocm.gpg.key | sudo apt-key add -
-echo deb [arch=amd64] http://repo.radeon.com/rocm/apt/debian/ xenial main > /etc/apt/sources.list.d/rocm.list
+wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
+echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/4.3/ ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list

Review comment:
       Then the TVM build is fine, but I see the following error when running an example, because TVM is built against LLVM 12:
   ```
   E           TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc
   E           line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
   ```
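
   The error reports a bitcode producer (LLVM 13) that is newer than the reader (LLVM 12). As a rough pre-flight check, the two version strings can be compared before running anything. This is only a sketch: `llvm_major` and `reader_can_load` are hypothetical helper names, and "reader major >= producer major" is a heuristic based on LLVM reading older bitcode but not newer, not a guarantee that the combination works.

   ```shell
   # Extract the leading major number from an LLVM version string, e.g. "13.0.0git" -> "13".
   llvm_major() {
     printf '%s\n' "${1%%.*}"
   }

   # Hypothetical helper: succeed when the reader is at least as new as the producer.
   # Usage: reader_can_load <producer-version> <reader-version>
   reader_can_load() {
     [ "$(llvm_major "$2")" -ge "$(llvm_major "$1")" ]
   }

   # Version strings taken from the error message above.
   if reader_can_load "13.0.0git" "12.0.1"; then
     echo "reader is new enough for this bitcode"
   else
     echo "reader is older than the bitcode producer; loading may fail"
   fi
   ```

   In a real check, the reader version could come from `llvm-config-12 --version`; where the producer version comes from depends on how the ROCm bitcode was built, which this sketch leaves as an input.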
   
   I went back and tried several versions of ROCm and LLVM (upstream, not the one bundled with ROCm). This is what I got when running an example with each combination:
   
   ```
   ROCm 4.3
   + lld-9 + llvm-config-9   -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-12 + llvm-config-12 -> TVMError: Fail to load bitcode file /opt/rocm/amdgcn/bitcode/hc.bc    line -1:Invalid record (Producer: 'LLVM13.0.0git' Reader: 'LLVM 12.0.1')
   
   ROCm 4.2
   + lld-9 + llvm-config-9   -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-10 + llvm-config-10 -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-11 + llvm-config-11 -> LLVM ERROR: Unknown specifier in datalayout string
   + lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
   
   ROCm 4.1
   + lld-9 + llvm-config-9   -> Works
   + lld-10 + llvm-config-10 -> Works
   + lld-11 + llvm-config-11 -> Works
   + lld-12 + llvm-config-12 -> Check failed: ret == 0 (-1 vs. 0) : TVMError: ROCM HIP Error: hipModuleLoadData(&(module_[device_id]), data_.c_str()) failed with error: hipErrorSharedObjectInitFailed
   
   ROCm 4.0
   + lld-9 + llvm-config-9   -> Works
   + lld-10 + llvm-config-10 -> Works
   + lld-11 + llvm-config-11 -> Works
   + lld-12 + llvm-config-12 -> Works
   ```
   
   It looks like the last ROCm version that works with every LLVM version I tested (9 through 12) is v4.0.
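
   For anyone who wants to re-derive that conclusion or extend the matrix with newer releases, here is a minimal sketch. The `results` variable is a hand transcription of the outcomes above (anything other than "Works" is recorded as `fail`), and `rocm_ok_everywhere` is a hypothetical helper name, not part of TVM or ROCm.

   ```shell
   # Outcomes transcribed from the matrix above: "<ROCm version> <LLVM version> <ok|fail>".
   results='
   4.3 9 fail
   4.3 10 fail
   4.3 11 fail
   4.3 12 fail
   4.2 9 fail
   4.2 10 fail
   4.2 11 fail
   4.2 12 fail
   4.1 9 ok
   4.1 10 ok
   4.1 11 ok
   4.1 12 fail
   4.0 9 ok
   4.0 10 ok
   4.0 11 ok
   4.0 12 ok
   '

   # Print every ROCm version for which no tested LLVM version failed.
   rocm_ok_everywhere() {
     printf '%s\n' "$results" | awk '
       NF == 3 { seen[$1] = 1; if ($3 != "ok") bad[$1] = 1 }
       END { for (v in seen) if (!(v in bad)) print v }
     ' | sort
   }

   rocm_ok_everywhere
   ```

   With the data as transcribed, the only version printed is 4.0.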




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


