Public bug reported:

[Impact]

Resolute (26.04 LTS) currently ships rocm-llvm 7.1.1+dfsg-0ubuntu1, which predates the ROCm 7.2.x point releases. Users on resolute therefore lack:

* device-libs and comgr support for hardware enabled in ROCm 7.2.x (notably the GFX12.5 cluster intrinsics and other newer AMDGPU subtarget features), so downstream ROCm components built against this rocm-llvm cannot generate or load code for these targets.
* Upstream bug fixes and toolchain hardening accumulated in ROCm 7.2.0 -> 7.2.3 across comgr, device-libs and hipcc.
* Compatibility with the LLVM 22 toolchain that the rest of the ROCm 7.2.x stack expects.

The SRU updates rocm-llvm to 7.2.3+dfsg-0ubuntu1, the latest 7.2.x point release. The fix is delivered as a new upstream release: the package is rebased to upstream ROCm 7.2.3, the build is moved onto the LLVM 22 toolchain (clang-22 / libclang-cpp22-dev / libclang-rt-22-dev), the binary rocm-device-libs-21 is renamed to rocm-device-libs-22 to match, and four delta patches that were either applied upstream or that targeted LLVM 21 are dropped. A single new patch (llvm22-options-header-rename.patch) adapts comgr to the LLVM 22 clang/Options/ split.

This update is a prerequisite for the rest of the ROCm 7.2.x stack (rocr-runtime, rocblas, rccl, rocthrust, rocwmma, amdsmi, ...) to be SRUed into resolute; without it those packages either fail to build or fail at runtime against the older rocm-llvm 7.1.1.

[Test Plan]

The package builds on amd64, arm64 and ppc64el. The verification steps below assume an amd64 host with `-proposed` enabled.

1. Enable -proposed for resolute and refresh:

   sudo sed -i 's/^# *deb /deb /' /etc/apt/sources.list.d/ubuntu.sources \
     || true
   # One printf argument per line of the deb822 stanza, so that the
   # command survives being copied with its indentation.
   printf '%s\n' \
     'Types: deb' \
     'URIs: http://archive.ubuntu.com/ubuntu' \
     'Suites: resolute-proposed' \
     'Components: main universe' \
     'Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg' \
     | sudo tee /etc/apt/sources.list.d/ubuntu-proposed.sources
   sudo apt update

2. Confirm the candidate version is the SRU:

   apt-cache policy hipcc libamd-comgr3 libamd-comgr-dev \
     rocm-device-libs-22
   # Expect 7.2.3+dfsg-0ubuntu1 as the -proposed candidate.

3. Install the SRU candidate and the matching toolchain:

   sudo apt install -t resolute-proposed \
     hipcc libamd-comgr3 libamd-comgr-dev rocm-device-libs-22 \
     clang-22 libclang-cpp22-dev libclang-rt-22-dev

4. Sanity-check that the comgr ABI is intact (libamd-comgr3 SONAME):

   dpkg -L libamd-comgr3 | grep -E 'libamd_comgr\.so'
   readelf -d /usr/lib/*/libamd_comgr.so.3 | grep SONAME
   # Expect: SONAME = libamd_comgr.so.3

5. Compile a trivial HIP program with hipcc to exercise comgr and the device-libs end-to-end (no GPU is required for compile-only checks such as this one or `--cuda-device-only -emit-llvm` style checks):

   cat > /tmp/hip_smoke.hip <<'EOF'
   #include <hip/hip_runtime.h>
   __global__ void k(int *p) { p[threadIdx.x] = threadIdx.x; }
   int main() {
     int *p = nullptr;
     hipMalloc(&p, 64*sizeof(int));
     hipLaunchKernelGGL(k, dim3(1), dim3(64), 0, 0, p);
     hipDeviceSynchronize();
     return 0;
   }
   EOF
   hipcc --offload-arch=gfx1100 -c -o /tmp/hip_smoke.o /tmp/hip_smoke.hip

6. Repeat the compile with --offload-arch=gfx1250 to exercise the new GFX12.5 path that motivated the LLVM 22 bump. Expect a clean compile:

   hipcc --offload-arch=gfx1250 -c -o /tmp/hip_smoke.o /tmp/hip_smoke.hip

7. (Optional, with a supported GPU) Run the rocr-runtime / rocminfo smoke test from -proposed to confirm runtime loading still works against the rebuilt libamd-comgr3.

8. Run the package's own autopkgtests:

   autopkgtest -U rocm-llvm=7.2.3+dfsg-0ubuntu1 -- lxd ubuntu:resolute

The update is considered verified when steps 4, 5, 6 and 8 all pass on the -proposed candidate and the equivalent steps still pass on the release pocket after the SRU is published.

[Where problems could occur]

The change is a new upstream version, not a targeted patch, so the blast radius is the union of (a) the upstream 7.1.1 -> 7.2.3 delta in comgr / device-libs / hipcc and (b) the toolchain move from LLVM 21 to LLVM 22.
Concretely, problems could appear in:

* libamd-comgr3: the comgr action API is consumed at runtime by every ROCm component that JIT-compiles or relocates GPU code (rocr-runtime, HIP, OpenCL, rocBLAS tuning, etc.). The SONAME stays at libamd_comgr.so.3, but any silent behavioural change in action-codegen / action-link / metadata parsing would surface as runtime failures in those consumers. The new llvm22-options-header-rename.patch reaches into clang's driver internals (GetResourcesPath / clang/Options/Options.h); if the LLVM 22 packaging in resolute differs subtly from what the patch assumes, comgr could mis-resolve the clang resource directory and fail to find builtin headers or device-libs at runtime.

* rocm-device-libs-22 (renamed from rocm-device-libs-21): consumers that hard-coded a dependency on rocm-device-libs-21 will not pull the new binary automatically. The rest of the ROCm 7.2.x stack needs to be rebuilt against the renamed package; until those rebuilds land in -proposed alongside this SRU, mixed installs could end up with neither -21 nor -22 satisfied.

* hipcc: changes to the driver wrapper or to default include / link paths can break out-of-tree HIP builds in non-obvious ways (e.g. picking up the wrong libclang_rt, or losing a default --offload-arch). Regressions here usually present as link-time "undefined reference" errors or as runtime "no kernel image available" errors on previously-working GPUs.

* Architecture coverage: the build is exercised on amd64, arm64 and ppc64el. arm64 and ppc64el have historically been the long pole for ROCm rebuilds; a successful build there does not guarantee that downstream rebuilds against this rocm-llvm will succeed on the same architectures.

Mitigations: the SRU is gated behind successful builds on all three release architectures, the autopkgtests above, and verification of the end-to-end hipcc -> device-libs -> comgr path on a real GPU.
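
The rocm-device-libs-21 rename risk listed above can be checked on a test system by scanning the Depends fields of installed packages. The following is a minimal sketch, not part of the package or its autopkgtests: the function name find_stale_device_libs_deps is hypothetical, and it assumes the stale dependency sits in the Depends field (Pre-Depends and Recommends are not inspected).

```shell
#!/bin/sh
# Hypothetical helper (not shipped by rocm-llvm): read "package<TAB>depends"
# lines on stdin and print each package whose Depends field still names
# rocm-device-libs-21 as a whole package name.
find_stale_device_libs_deps() {
    awk -F '\t' '$2 ~ /(^|[ ,|])rocm-device-libs-21([ ,(|:]|$)/ { print $1 }'
}

# Example invocation on an installed system (dpkg-query expands \t and \n):
#   dpkg-query -W -f='${Package}\t${Depends}\n' | find_stale_device_libs_deps
```

An empty result suggests that no installed package still names the old binary; any package printed would need a rebuild against rocm-device-libs-22 before the mixed-install case can be ruled out.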
The upstream 7.2.3 tag is the same source the rest of the ROCm 7.2.x SRU train is built against, so any divergence between this package and its consumers is bounded by the packaging delta, which is small and reviewed (4 patches, see debian/patches/series).

** Affects: rocm-llvm (Ubuntu)
     Importance: Undecided
     Assignee: Talha Can Havadar (tchavadar)
         Status: New

** Affects: rocm-llvm (Ubuntu Resolute)
     Importance: Undecided
         Status: New

** Summary changed:

- [SRU] rocm-llvm 7.2.3+dfsg-0ubuntu1 for resolute
+ [SRU] rocm-llvm 7.2.3

** Also affects: rocm-llvm (Ubuntu Resolute)
   Importance: Undecided
       Status: New

** Changed in: rocm-llvm (Ubuntu)
     Assignee: (unassigned) => Talha Can Havadar (tchavadar)

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2153424

Title:
  [SRU] rocm-llvm 7.2.3

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/rocm-llvm/+bug/2153424/+subscriptions
