Public bug reported:

# What was expected

I installed librocsparse1-tests in a Resolute Docker container, then
installed the AMD ROCm 7.1.0 packages (which install to /opt/rocm-7.1.0)
in a Noble Docker container. I copied the contents of /opt/rocm-7.1.0
from the Noble container into my Resolute container. This gave me two
copies of ROCm 7.1.0, so that I could mix and match libraries.

I ran rocsparse-test with the libamd_comgr.so from /opt/rocm-7.1.0
loaded via LD_PRELOAD. With that library the test passes. The LD_PRELOAD
should not be necessary.

# LD_PRELOAD=/opt/rocm-7.1.0/lib/libamd_comgr.so /usr/libexec/rocm/librocsparse1-tests/rocsparse-test --gtest_filter='quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0'
Query device success: there are 2 devices
Device ID 0: AMD Radeon RX 9070 XT
-------------------------------------------------------------------------
with 16304MB memory, clock rate 2520MHz @ computing capability 12.0
maxGridDimX 2147483647, sharedMemPerBlock 64KB, maxThreadsPerBlock 1024
wavefrontSize 32
-------------------------------------------------------------------------
Device ID 1: AMD Ryzen 9 7950X 16-Core Processor
-------------------------------------------------------------------------
with 47800MB memory, clock rate 2200MHz @ computing capability 10.3
maxGridDimX 2147483647, sharedMemPerBlock 64KB, maxThreadsPerBlock 1024
wavefrontSize 32
-------------------------------------------------------------------------
Using device ID 0 (AMD Radeon RX 9070 XT) for rocSPARSE
-------------------------------------------------------------------------
rocSPARSE version: 4.1.0-
rocSPARSE data path: /usr/share/librocsparse1-tests/data/
rocsparse-test: debug arguments verbose is disabled for testings (use --force-debug-arguments-verbose to skip the disabling)
rocsparse-test: warnings are disabled for testings (use --force-warnings to skip the disabling)
Note: Google Test filter = quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from quick/axpby
[ RUN      ] quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0
[       OK ] quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0 (39 ms)
[----------] 1 test from quick/axpby (39 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test suite ran. (46 ms total)
[  PASSED  ] 1 test.
Failed to free ptr:0x7fd9d2e00000 size:2097152

The "Failed to free ptr" warning at the end of the output is tracked at
https://bugs.launchpad.net/ubuntu/+source/rocr-runtime/+bug/2142805.

# What happened

Using the comgr provided by libamd-comgr3 7.1.0+dfsg-0ubuntu4, the test
instead aborts with a fatal missing-symbol error when the GPU kernel is
invoked:

# /usr/libexec/rocm/librocsparse1-tests/rocsparse-test --gtest_filter='quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0'
Query device success: there are 2 devices
Device ID 0: AMD Radeon RX 9070 XT
-------------------------------------------------------------------------
with 16304MB memory, clock rate 2520MHz @ computing capability 12.0
maxGridDimX 2147483647, sharedMemPerBlock 64KB, maxThreadsPerBlock 1024
wavefrontSize 32
-------------------------------------------------------------------------
Device ID 1: AMD Ryzen 9 7950X 16-Core Processor
-------------------------------------------------------------------------
with 47800MB memory, clock rate 2200MHz @ computing capability 10.3
maxGridDimX 2147483647, sharedMemPerBlock 64KB, maxThreadsPerBlock 1024
wavefrontSize 32
-------------------------------------------------------------------------
Using device ID 0 (AMD Radeon RX 9070 XT) for rocSPARSE
-------------------------------------------------------------------------
rocSPARSE version: 4.1.0-
rocSPARSE data path: /usr/share/librocsparse1-tests/data/
rocsparse-test: debug arguments verbose is disabled for testings (use --force-debug-arguments-verbose to skip the disabling)
rocsparse-test: warnings are disabled for testings (use --force-warnings to skip the disabling)
Note: Google Test filter = quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from quick/axpby
[ RUN      ] quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0
:0:./hipamd/src/hip_global.cpp:109 : 195188722870 us:  Cannot find Symbol with name: _ZN9rocsparseL12axpyi_kernelILj256EfiffEEvT1_NS_24const_host_device_scalarIT0_EEPKT2_PKS1_PT3_21rocsparse_index_base_b
Aborted (core dumped) /usr/libexec/rocm/librocsparse1-tests/rocsparse-test --gtest_filter='quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0'
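
The two runs above differ only in whether LD_PRELOAD points at the
libamd_comgr.so from /opt/rocm-7.1.0. A minimal Python sketch for
reproducing both cases back to back follows. The paths are copied from
the commands in this report; the helper names (build_env, run_case,
main) are hypothetical, and this script is an illustration rather than
what was actually used.

```python
import os
import subprocess

# Paths copied from the commands in this report; adjust for your layout.
TEST_BIN = "/usr/libexec/rocm/librocsparse1-tests/rocsparse-test"
GTEST_FILTER = "quick/axpby.level1/i32_f32_r_f32_r_f32_r_1200_5_1_0_0_0_0b_0"
UPSTREAM_COMGR = "/opt/rocm-7.1.0/lib/libamd_comgr.so"


def build_env(preload=None, base=None):
    """Return a copy of the environment, with LD_PRELOAD set or removed."""
    env = dict(os.environ if base is None else base)
    if preload is None:
        env.pop("LD_PRELOAD", None)
    else:
        env["LD_PRELOAD"] = preload
    return env


def run_case(cmd, preload=None):
    """Run cmd and return its exit status (negative: killed by a signal)."""
    return subprocess.run(cmd, env=build_env(preload)).returncode


def main():
    cmd = [TEST_BIN, f"--gtest_filter={GTEST_FILTER}"]
    print("packaged comgr exit status:", run_case(cmd))
    print("preloaded comgr exit status:", run_case(cmd, UPSTREAM_COMGR))


# Only run automatically where the test binary is actually installed.
if __name__ == "__main__" and os.path.exists(TEST_BIN):
    main()
```

On Linux, subprocess reports a process killed by SIGABRT as a negative
return code, so the aborting run would be expected to show a negative
status while the passing run shows 0.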

# ltrace info

I ran both the passing and the failing case under
ltrace -e 'amd_comgr*' -e 'hip*' -e 'hsa*'. The logs are attached. I
also prepared a diff of the two logs after stripping the function
arguments from the ltrace output, so that only the sequence of calls is
compared.
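
For anyone who wants to redo the comparison, here is a minimal Python
sketch of one way to strip the arguments and diff the call sequences.
It is not the script used to produce the attached diff. It assumes each
traced call appears on a line shaped like "library->function(args) =
result" and silently skips any line that does not match; the helper
names are hypothetical.

```python
import difflib
import re

# Assumed line shape: optional "library->" prefix, a function name, then "(".
CALL_RE = re.compile(r"^(?:\S+->)?([A-Za-z_]\w*)\(")


def strip_arguments(lines):
    """Keep only the called function name from each matching ltrace line."""
    names = []
    for line in lines:
        match = CALL_RE.match(line.strip())
        if match:
            names.append(match.group(1))
    return names


def diff_calls(passing, failing):
    """Return a unified diff of the two call sequences as a list of lines."""
    return list(
        difflib.unified_diff(
            strip_arguments(passing),
            strip_arguments(failing),
            fromfile="passing",
            tofile="failing",
            lineterm="",
        )
    )
```

Reading each log with open(path).read().splitlines() and passing the two
lists to diff_calls yields a diff that ignores pointer values and other
arguments that change from run to run.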

# System Info

## lsb_release -rd
Description: Ubuntu Resolute Raccoon (development branch)
Release: 26.04

## apt-cache policy libamd-comgr3
libamd-comgr3:
  Installed: 7.1.0+dfsg-0ubuntu4
  Candidate: 7.1.0+dfsg-0ubuntu4
  Version table:
 *** 7.1.0+dfsg-0ubuntu4 500
        500 http://archive.ubuntu.com/ubuntu resolute/universe amd64 Packages
        100 /var/lib/dpkg/status

** Affects: rocm-llvm (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "diff of passing and failing ltrace logs"
   
https://bugs.launchpad.net/bugs/2142813/+attachment/5948838/+files/rocsparse-test.diff

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2142813

Title:
  missing symbol error loading compressed bundles


