** Description changed:

  [Impact]
  
  Resolute (26.04 LTS) currently ships rocm-llvm 7.1.1+dfsg-0ubuntu1, which
  predates the ROCm 7.2.x point releases. Users on resolute therefore lack:
  
   * device-libs and comgr support for hardware enabled in ROCm 7.2.x
     (notably the GFX12.5 cluster intrinsics and other newer AMDGPU
     subtarget features), which means downstream ROCm components built
     against this rocm-llvm cannot generate or load code for these
     targets.
   * Upstream bug fixes and toolchain hardening accumulated in ROCm
     7.2.0 -> 7.2.3 across comgr, device-libs and hipcc.
   * Compatibility with the LLVM 22 toolchain that the rest of the ROCm
     7.2.x stack expects.
  
  The SRU updates rocm-llvm to 7.2.3+dfsg-0ubuntu1, the latest 7.2.x
  point release. The mechanism of the fix is a new upstream release: the
  package is rebased to upstream ROCm 7.2.3, the build is moved onto the
  LLVM 22 toolchain (clang-22 / libclang-cpp22-dev / libclang-rt-22-dev),
  the binary rocm-device-libs-21 is renamed to rocm-device-libs-22 to
  match, and four delta patches that were either applied upstream or that
  targeted LLVM 21 are dropped. A single new patch
  (llvm22-options-header-rename.patch) adapts comgr to the LLVM 22
  clang/Options/ split.
  
  This update is a prerequisite for the rest of the ROCm 7.2.x stack
  (rocr-runtime, rocblas, rccl, rocthrust, rocwmma, amdsmi, ...) to be
  SRUed into resolute; without it those packages either fail to build or
  fail at runtime against the older rocm-llvm 7.1.1.
  
  [Test Plan]
  
  The package builds on amd64, arm64 and ppc64el. The verification steps
  below assume an amd64 host with `-proposed` enabled.
  
    1. Enable -proposed for resolute and refresh:
         sudo sed -i 's/^# *deb /deb /' /etc/apt/sources.list.d/ubuntu.sources \
           || true
         printf '%s\n' \
           'Types: deb' \
           'URIs: http://archive.ubuntu.com/ubuntu' \
           'Suites: resolute-proposed' \
           'Components: main universe' \
           'Signed-By: /usr/share/keyrings/ubuntu-archive-keyring.gpg' \
           | sudo tee /etc/apt/sources.list.d/ubuntu-proposed.sources
         sudo apt update
  
    2. Confirm the candidate version is the SRU:
         apt-cache policy hipcc libamd-comgr3 libamd-comgr-dev \
                          rocm-device-libs-22
         # Expect 7.2.3+dfsg-0ubuntu1 as the -proposed candidate.
  
    3. Install the SRU candidate and the matching toolchain:
         sudo apt install -t resolute-proposed \
           hipcc libamd-comgr3 libamd-comgr-dev rocm-device-libs-22 \
           clang-22 libclang-cpp22-dev libclang-rt-22-dev
  
    4. Sanity-check that the comgr ABI is intact (libamd-comgr3 SONAME):
         dpkg -L libamd-comgr3 | grep -E 'libamd_comgr\.so'
         readelf -d /usr/lib/*/libamd_comgr.so.3 | grep SONAME
         # Expect: SONAME = libamd_comgr.so.3
  
    5. Compile a trivial HIP program with hipcc to exercise comgr and the
       device-libs end-to-end. Compiling with an explicit --offload-arch
       does not require a GPU:
  
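         cat > /tmp/hip_smoke.hip <<'EOF'
         #include <hip/hip_runtime.h>
         // Each thread writes its own index into the buffer.
         __global__ void k(int *p) { p[threadIdx.x] = threadIdx.x; }
         int main() {
           int *p = nullptr;
           // Check every hipError_t result instead of discarding it.
           if (hipMalloc(&p, 64 * sizeof(int)) != hipSuccess) return 1;
           hipLaunchKernelGGL(k, dim3(1), dim3(64), 0, 0, p);
           if (hipDeviceSynchronize() != hipSuccess) return 1;
           if (hipFree(p) != hipSuccess) return 1;
           return 0;
         }
         EOF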
         hipcc --offload-arch=gfx1100 -c -o /tmp/hip_smoke.o /tmp/hip_smoke.hip
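         # Expect a clean compile.
  
    6. Repeat the compile with --offload-arch=gfx1250 to exercise the new
       GFX12.5 path that motivated the LLVM 22 bump:
         hipcc --offload-arch=gfx1250 -c -o /tmp/hip_smoke.o /tmp/hip_smoke.hip
         # Expect a clean compile.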
  
    7. (Optional, with a supported GPU) Run the rocr-runtime / rocminfo
       smoke test from -proposed to confirm runtime loading still works
       against the rebuilt libamd-comgr3.
  
    8. Run the package's own autopkgtests:
         autopkgtest -U rocm-llvm=7.2.3+dfsg-0ubuntu1 -- lxd ubuntu:resolute
  
  The update is considered verified when steps 4, 5, 6 and 8 all pass on
  the -proposed candidate and the equivalent steps still pass on the
  release pocket after the SRU is published.
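  
  Steps 2 and 4 lend themselves to scripting. The following is a minimal
  sketch, not part of the package: the helper names candidate_version,
  check_candidate and soname_from_readelf are hypothetical, and the
  parsing assumes the usual `apt-cache policy` and `readelf -d` output
  layouts. Each helper reads the tool output on stdin and handles one
  package or library per invocation.

```shell
#!/bin/sh
# Hypothetical helper: print the version on the first "Candidate:" line
# of `apt-cache policy <pkg>` output read from stdin.
candidate_version() {
  awk '$1 == "Candidate:" { print $2; exit }'
}

# Hypothetical helper: compare that candidate against the expected SRU
# version given as $1. Returns non-zero on a mismatch.
check_candidate() {
  expected=$1
  actual=$(candidate_version)
  if [ "$actual" = "$expected" ]; then
    echo "OK $actual"
  else
    echo "MISMATCH expected=$expected actual=${actual:-none}"
    return 1
  fi
}

# Hypothetical helper: print the SONAME found in `readelf -d <lib>`
# output read from stdin (the text between the square brackets).
soname_from_readelf() {
  sed -n 's/.*(SONAME).*\[\(.*\)].*/\1/p'
}
```

  Possible usage on the test host:
       apt-cache policy libamd-comgr3 | check_candidate 7.2.3+dfsg-0ubuntu1
       readelf -d /usr/lib/*/libamd_comgr.so.3 | soname_from_readelf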
  
  [Where problems could occur]
  
  The change is a new upstream version, not a targeted patch, so the
  blast radius is the union of (a) the upstream 7.1.1 -> 7.2.3 delta in
  comgr / device-libs / hipcc and (b) the toolchain move from LLVM 21 to
  LLVM 22. Concretely, problems could appear in:
  
   * libamd-comgr3: the comgr action API is consumed at runtime by every
     ROCm component that JIT-compiles or relocates GPU code (rocr-runtime,
     HIP, OpenCL, rocBLAS tuning, etc.). The SONAME stays at
     libamd_comgr.so.3, but any silent behavioural change in
     action-codegen / action-link / metadata parsing would surface as
     runtime failures in those consumers. The new
     llvm22-options-header-rename.patch reaches into clang's driver
     internals (GetResourcesPath / clang/Options/Options.h); if the LLVM
     22 packaging in resolute differs subtly from what the patch assumes,
     comgr could mis-resolve the clang resource directory and fail to
     find builtin headers or device-libs at runtime.
  
   * rocm-device-libs-22 (renamed from rocm-device-libs-21): consumers
     that hard-coded a dependency on rocm-device-libs-21 will not pull
     the new binary automatically. The rest of the ROCm 7.2.x stack
     needs to be rebuilt against the renamed package; until those
     rebuilds land in -proposed alongside this SRU, mixed installs could
     end up with neither -21 nor -22 satisfied.
  
   * hipcc: changes to the driver wrapper or to default include / link
     paths can break out-of-tree HIP builds in non-obvious ways
     (e.g. picking up the wrong libclang_rt, or losing a default
     --offload-arch). Regressions here usually present as link-time
     "undefined reference" errors or as runtime "no kernel image
     available" errors on previously-working GPUs.
  
   * Architecture coverage: the build is exercised on amd64, arm64 and
     ppc64el. arm64 and ppc64el have historically been the long pole for
     ROCm rebuilds; a successful build there does not guarantee that
     downstream rebuilds against this rocm-llvm will succeed on the same
     architectures.
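  
  The rocm-device-libs rename risk can be checked on a test host by
  looking for installed packages that still depend on the old binary
  name. This is a rough sketch under stated assumptions: the helper name
  stale_device_libs_deps is hypothetical, and it expects tab-separated
  "package<TAB>depends" lines on stdin, as produced by dpkg-query with
  the format string shown in the usage note.

```shell
#!/bin/sh
# Hypothetical helper: print the names of packages whose Depends field
# still mentions rocm-device-libs-21 as a whole package name.
# Input: one "package<TAB>depends" line per package on stdin.
stale_device_libs_deps() {
  awk -F '\t' '$2 ~ /(^|[ ,|])rocm-device-libs-21([ ,(|]|$)/ { print $1 }'
}
```

  Possible usage:
       dpkg-query -W -f='${Package}\t${Depends}\n' | stale_device_libs_deps
  An empty result suggests no installed consumer still pins the -21
  binary. For the comgr resource-directory concern, comparing against
  the output of `clang-22 -print-resource-dir` is one way to see which
  directory the packaged clang itself reports; whether comgr resolves
  the same path is the assumption under test.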
  
  Mitigations: the SRU is gated behind successful builds on all three
  release architectures, the autopkgtests above, and verification of the
  end-to-end hipcc -> device-libs -> comgr path on a real GPU. The
  upstream 7.2.3 tag is the same source the rest of the ROCm 7.2.x SRU
  train is built against, so any divergence between this package and its
  consumers is bounded by the packaging delta, which is small and
  reviewed (4 patches, see debian/patches/series).
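  
  A reviewer can confirm the size of the packaging delta directly from
  the series file. The sketch below assumes a standard quilt series
  layout (one patch per line, `#` comments allowed); the helper name
  count_series_patches is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count active entries in a quilt series file,
# skipping blank lines and lines that start with '#'.
count_series_patches() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | wc -l | tr -d ' '
}
```

  Possible usage from the unpacked source tree:
       count_series_patches debian/patches/series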
+ 
+ ABI/API Compatibility Report
+ 
+ === Comparing libamd-comgr3 ===
+ Running: abipkgdiff
+   /home/ubuntu/actions-runner/_work/bullwinkle-cicd/bullwinkle-cicd/old/libamd-comgr3_7.1.1+dfsg-0ubuntu1_amd64.deb
+   /home/ubuntu/actions-runner/_work/bullwinkle-cicd/bullwinkle-cicd/new/libamd-comgr3_7.2.3+dfsg-0ubuntu1~git202605201759.0cd6869_amd64.deb

https://bugs.launchpad.net/bugs/2153424

Title:
  [SRU] rocm-llvm 7.2.3
