Public bug reported:

## SRU ##

[ Impact ]

 * amdsmi 7.2.3 (libamd-smi26) ships four CLI/library bug fixes over the
   7.2.0 release already in resolute:

   1. [SWDEV-560235] gpu_board and base_board temperature monitoring was
      implemented inline in amdsmi_commands.py, duplicating ~80 lines of
      identical logic in both the `metric` and `monitor` code paths. The
      refactored helpers get_gpu_board_temperatures /
      get_base_board_temperatures
      are extracted to amdsmi_helpers.py and two new `amd-smi monitor` flags
      are exposed: `--base-board-temps` (-b) and `--gpu-board-temps` (-o).
      Systems without the relevant sensors (e.g. MI100, consumer GPUs) see
      all values reported as N/A; no behaviour change for existing invocations.

   2. [SWDEV-566465] `amd-smi reset --pptpowerusage` produced malformed JSON
      when run across multiple GPUs. In the multi-device path print_output()
      was called before store_multiple_device_output(), so each GPU's result
      was flushed and the accumulation buffer was cleared prematurely. The
      fix moves the early-return guard after store_multiple_device_output().
      Single-GPU usage and non-JSON output are unaffected.

   3. The CPER AFID list printed garbage values on invalid CPER files.
      When amdsmi_cper_decode encountered an unrecognised section type it fell
      through to the else branch (logging an error) but then unconditionally
      called afids.emplace_back() on the uninitialised afid variable. The fix
      initialises afid = -1 before the if/else chain and only appends when
      afid >= 0. Symptom: `amd-smi ras --cper` could print spurious integer
      values in the AFID column when fed a CPER file with unknown section types.

   4. [SWDEV-560828] `amd-smi ras --cper --follow --file <path>` sent all
      output to stdout instead of to the specified file. The CPER display
      helpers (display_cper_files_generated, _print_header, dump_cper_entries)
      were unconditionally calling print(); they now check the logger's
      destination and write to the file when one is set.
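
The ordering problem behind fix 2 can be sketched in a few lines of Python.
This is a simplified model, not the upstream code: SketchLogger, reset_one,
run and the "result" key are hypothetical, and only the method names
print_output and store_multiple_device_output are taken from the description
above.

```python
import json


class SketchLogger:
    """Hypothetical stand-in for the amd-smi logger, for illustration only."""

    def __init__(self):
        self.output = {}                  # result for the GPU being handled
        self.multiple_device_output = []  # accumulated per-GPU results
        self.stream = []                  # what would be written out

    def store_output(self, gpu_id, values):
        self.output = {"gpu": gpu_id, **values}

    def store_multiple_device_output(self):
        if self.output:
            self.multiple_device_output.append(self.output)
            self.output = {}

    def print_output(self, multiple_device_enabled=False):
        if multiple_device_enabled:
            payload = self.multiple_device_output
        else:
            payload = self.output
        self.stream.append(json.dumps(payload))
        # Printing flushes both buffers.
        self.output = {}
        self.multiple_device_output = []


def reset_one(logger, gpu_id, multiple_devices, fixed):
    # "result": "ok" is a placeholder, not the real amd-smi field.
    logger.store_output(gpu_id, {"result": "ok"})
    if fixed:
        if multiple_devices:
            logger.store_multiple_device_output()
            return  # the caller prints once, after the last GPU
        logger.print_output()
    else:
        logger.print_output()  # flushes this GPU's result too early
        if multiple_devices:
            logger.store_multiple_device_output()  # nothing left to store
            return


def run(gpu_ids, fixed):
    logger = SketchLogger()
    multiple = len(gpu_ids) > 1
    for gpu_id in gpu_ids:
        reset_one(logger, gpu_id, multiple, fixed)
    if multiple:
        logger.print_output(multiple_device_enabled=True)
    return "\n".join(logger.stream)
```

In this model, run([0, 1], fixed=False) yields several JSON documents back to
back, which a parser rejects, while run([0, 1], fixed=True) yields a single
JSON list. The single-GPU path is the same in both variants.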

   The update also carries a packaging-only change: debian/watch and
   debian/copyright are updated to track amdsmi from the new ROCm/rocm-systems
   monorepo release assets instead of the individual upstream repository.
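
For fix 3, the sentinel pattern can be illustrated as follows. The upstream
change is in the C++ decoder; this Python sketch only mirrors the post-fix
control flow (Python has no uninitialised reads, so the pre-fix garbage
cannot be reproduced directly). The section names and AFID numbers are
invented for the example.

```python
# Hypothetical section-type -> AFID table; the real mapping lives upstream.
KNOWN_SECTION_AFIDS = {"section_a": 30, "section_b": 31}


def collect_afids(section_types):
    afids = []
    for section_type in section_types:
        afid = -1  # sentinel: no AFID decoded for this section
        if section_type in KNOWN_SECTION_AFIDS:
            afid = KNOWN_SECTION_AFIDS[section_type]
        # else: unrecognised section type; upstream logs an error here
        if afid >= 0:
            # Only valid AFIDs reach the list.
            afids.append(afid)
    return afids
```

With this guard an unknown section contributes nothing to the list, so
collect_afids(["section_a", "bogus", "section_b"]) contains only the two
known AFIDs.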

[ Test Plan ]

 1. Build:
    - sbuild or dpkg-buildpackage succeeds.
    - dpkg --compare-versions 7.2.3-0ubuntu1 gt 7.2.0-3 confirms the new
      version is greater.
    - Run dpkg-gensymbols against libgoamdsmi-shim64-1; confirm no symbols
      added, removed, or changed.

 2. Installability:
    - apt install amdsmi libamd-smi26 python3-amdsmi libgoamdsmi-shim64-1.
    - Confirm reverse dependencies remain installable without rebuild.

 3. Autopkgtest:
    - Run autopkgtest suite (run-amdsmi) on a GPU-equipped testbed.
    - All tests pass. Output:

      autopkgtest [17:55:00]: starting date and time: 2026-05-21 17:55:00+0000
      autopkgtest [17:55:00]: version 5.55
      autopkgtest [17:55:00]: host lxc-sessionizer-dev; command line: /usr/bin/autopkgtest -U deb-amdsmi/amdsmi 'deb-amdsmi/amdsmi_7.2.3-0ubuntu1~ppa1~26.04_amd64.changes' -- lxd ubuntu-daily:resolute --profile=rocm-gpu
      autopkgtest [17:57:17]: testbed release detected to be: resolute
      autopkgtest [17:57:35]: test run-amdsmi: -----------------------]
      autopkgtest [17:57:35]: test run-amdsmi:  - - - - - - - - - - results - - - - - - - - - -
      run-amdsmi           PASS
      autopkgtest [17:57:35]: @@@@@@@@@@@@@@@@@@@@ summary
      run-amdsmi           PASS
      2026-05-21 17:57:37 - Autopkg tests ended for amdsmi.
      Tests took: 0h 2m 37s.

[ Where problems could occur ]

 1. JSON reset output (fix 2, low risk): Applications parsing the JSON output
    of `amd-smi reset --pptpowerusage` on multi-GPU systems were receiving
    incomplete or duplicated per-GPU blocks. The fix corrects the ordering;
    any code that was working around the broken output may need updating.

 2. CPER AFID list (fix 3, very low risk): Systems that were relying on the
    garbage AFID values to detect unknown section types will no longer see them.
    The correct behaviour is to omit invalid AFIDs entirely.

 3. CPER --follow --file redirection (fix 4, low risk): The change redirects
    CPER output away from stdout when --file is specified; any pipeline that
    was capturing stdout from `amd-smi ras --cper --follow --file` will now
    receive an empty stream (output goes to the file instead). This is the
    intended behaviour.

 4. New monitor flags (fix 1, no risk for existing invocations): The new
    --base-board-temps and --gpu-board-temps flags are purely additive; default
    monitor output is unchanged. On GPUs that do not expose these sensors all
    values will be N/A.
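
The stdout change described in item 3 (fix 4) can be pictured with a small
sketch. The emit and demo functions are hypothetical and do not reflect the
upstream logger API; they only show output going to a file when a
destination is set and to stdout otherwise.

```python
import contextlib
import io
import os
import tempfile


def emit(line, destination=None):
    """Write to the destination file when one is set, else to stdout."""
    if destination is None:
        print(line)
        return
    with open(destination, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")


def demo():
    """Return (captured stdout, file contents) for one redirected line."""
    captured = io.StringIO()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cper.log")
        with contextlib.redirect_stdout(captured):
            emit("cper entry 1", destination=path)
        with open(path, encoding="utf-8") as handle:
            file_text = handle.read()
    return captured.getvalue(), file_text
```

A pipeline that previously read the CPER lines from stdout would, in this
model, see an empty stream and would need to read the file instead.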

[ Other Info ]

 * No ABI breakage: debian/libgoamdsmi-shim64-1.symbols is unchanged between
   7.2.0 and 7.2.3. The SONAME of the main library remains libamd-smi.so.26.
 * This update is part of the coordinated ROCm 7.2.3 stack release.
 * PPA: https://launchpad.net/~igorluppi/+archive/ubuntu/amdsmi-7.2.3
 * Upstream version comparison:
   https://github.com/ROCm/amdsmi/compare/rocm-7.2.0...rocm-7.2.3
 * Target: resolute 26.04 LTS

** Affects: amdsmi (Ubuntu)
     Importance: Undecided
     Assignee: Igor Luppi (igorluppi)
         Status: New

** Changed in: amdsmi (Ubuntu)
     Assignee: (unassigned) => Igor Luppi (igorluppi)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2153816

Title:
  SRU: New upstream version 7.2.3

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/amdsmi/+bug/2153816/+subscriptions
