Public bug reported:

[ Impact ]

 * rocFFT 7.2.4 (library version 1.0.36) fixes three correctness issues present
   in 7.1.1:

   - A potential division-by-zero crash when constructing FFT plans using
     dimensions of length 1. Affected callers would see a hard crash or
     undefined behaviour at plan creation time rather than a clean error.

   - Incorrect result scaling on multi-device transforms. Applications using
     rocFFT across multiple GPUs could silently produce wrong numerical
     results without any error being raised.

   - Broken callbacks on multi-device transforms. Custom load/store callbacks
     registered by the caller were not being invoked correctly when the
     transform spanned multiple devices, leading to silent data corruption
     for any application relying on that feature.

 * Additionally, two performance improvements are included: removal of an
   unnecessary global transpose from MPI 3D multi-GPU pencil decompositions,
   and enablement of the same optimisation for single-process multi-GPU
   transforms. These reduce latency and memory traffic for HPC workloads
   using distributed FFTs.

 * Without this update, users on resolute running multi-GPU or length-1
   FFT workloads may experience crashes, silent wrong results, or missed
   callbacks. The fixes are present only in the 7.2.x upstream branch and
   cannot be cherry-picked without pulling in the full release.

 * Reverse dependencies: librocfft-dev (headers only); no other Ubuntu
   archive package links against librocfft0 at this time.
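
   A scaling error of the kind described above raises no error, so it is
   only visible when output is compared against a reference. The sketch
   below is a minimal illustration and not part of the rocFFT test suite:
   it uses a naive O(n^2) DFT as the reference and a relative l-inf error
   as the metric. The "library outputs" are hypothetical stand-ins built
   from the reference itself, one correct and one off by a constant factor.

```python
import cmath


def naive_dft(x):
    """Unnormalised forward DFT, O(n^2); intended as a reference only."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def rel_linf_error(candidate, reference):
    """Largest absolute difference divided by the largest reference magnitude."""
    peak = max(abs(r) for r in reference)
    return max(abs(c - r) for c, r in zip(candidate, reference)) / peak


signal = [complex(i % 5, -(i % 3)) for i in range(16)]
reference = naive_dft(signal)

# Hypothetical stand-ins for library output (assumption, not real rocFFT data):
# one matches the reference, the other is scaled by a constant factor of 2.
good_output = list(reference)
misscaled_output = [2 * v for v in reference]

good_err = rel_linf_error(good_output, reference)
bad_err = rel_linf_error(misscaled_output, reference)
print(f"correct: {good_err:.3g}  mis-scaled: {bad_err:.3g}")
```

   A constant scale error shows up as a relative error of order 1, far
   above any floating-point tolerance, so a check of this shape would
   catch it even on small inputs.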

[ Test Plan ]

 1. Build:
    - dpkg-buildpackage -S succeeds (source package builds cleanly).
    - dpkg --compare-versions 7.2.4-0ubuntu1 gt 7.1.1-0ubuntu1 exits 0 (true).
    - No symbols file is shipped (library uses a version script to restrict
      exported symbols to the rocfft_ prefix); ABI was verified with abidiff:
        abidiff librocfft.so.0.1 (7.1.1) vs librocfft.so.0.1 (7.2.4)
        → exit code 0, no output. SOVERSION unchanged at 0.

 2. Installability:
    - apt install librocfft0 librocfft-dev succeeds.
    - No reverse dependencies require a rebuild.

 3. Autopkgtest (test: librocfft0-tests):
    Full run against ppa:igorluppi/rocfft-7.2.3, LXD rocm-gpu profile,
    Ubuntu resolute, amd64 (2026-06-01):

      Number of successful tests: 95436
      Number of skipped tests: 120356
      Number of runtime issues: 0
       2 FAILED TESTS
      Test suite took 3 hours 18 minutes

      half precision max l-inf epsilon: 0.000746231
      single precision max l-inf epsilon: 0.0391119
      double precision max l-inf epsilon: 4.58786e-16

      autopkgtest summary:
      librocfft0-tests     FAIL non-zero exit status 1

    The 2 failures are GPU out-of-memory conditions on extreme test sizes
    (buffer requirements of 1.5–2.1 GB), not correctness failures:
      - pow2_1D complex forward len 268435456, single, batch 1 → 2.1 GB
      - pow5_1D complex forward len 48828125,  single, batch 4 → 1.56 GB

    These failures are non-deterministic (different tests fail per run due
    to random seed and GPU memory state) and predate this version bump:
    journalctl on the test machine shows rocfft-test OOM-killed by the
    kernel on May 19 and May 26 (anon-rss ~22 GB) during 7.1.x testing.

    A targeted re-run of the previously failing tests with --gtest_filter
    passed cleanly: 20/20, 0 failures (run: 2026-05-29, 8m 54s, PASS).
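
    The buffer sizes quoted for the two failing tests can be reproduced
    with simple arithmetic. The sketch below assumes interleaved
    single-precision complex data (8 bytes per element) and counts a
    single buffer only; any separate output or work buffers would add to
    these figures.

```python
BYTES_PER_COMPLEX_SINGLE = 8  # two 32-bit floats per element (assumed layout)


def buffer_bytes(length, batch, bytes_per_element=BYTES_PER_COMPLEX_SINGLE):
    """Size of one buffer holding `batch` transforms of `length` elements."""
    return length * batch * bytes_per_element


pow2 = buffer_bytes(268435456, 1)  # 2**28 elements, batch 1
pow5 = buffer_bytes(48828125, 4)   # 5**11 elements, batch 4

for name, size in (("pow2_1D", pow2), ("pow5_1D", pow5)):
    print(f"{name}: {size} bytes = {size / 1e9:.2f} GB = {size / 2**30:.2f} GiB")
```

    Under these assumptions the results agree with the 2.1 GB and 1.56 GB
    figures above when read as decimal gigabytes.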

[ Where problems could occur ]

 * FFT correctness regression: if the upstream fixes for multi-device
   scaling or callbacks introduced a new bug, applications running
   multi-GPU transforms could produce wrong numerical results. Symptom:
   output values differ from single-GPU reference or from FFTW, detectable
   by the accuracy test suite.

 * Plan creation regression: if the division-by-zero fix interacted badly
   with other plan construction paths, rocfft_plan_create could return an
   unexpected error status or crash for previously working problem sizes.
   Symptom: applications fail at plan creation with a non-zero return code.

 * Callback regression: if the multi-device callback fix broke the
   single-device callback path, applications using load/store callbacks
   would see their callback never invoked or invoked with wrong arguments.
   Symptom: silent wrong output in callback-dependent code paths.

 * ABI break: although abidiff reports no changes, a missed symbol removal
   would manifest as an undefined-symbol (symbol lookup) error from the
   dynamic linker when a dependent binary is loaded or first calls the
   missing function. SOVERSION is unchanged, so the package manager will
   not flag this; it would only appear at runtime.
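
   As a cross-check on the ABI item, removed exports can also be spotted by
   comparing the symbol lists of the two library builds. The sketch below
   is a minimal illustration that parses text in the format printed by
   `nm -D --defined-only`; the listings, addresses and the choice of
   symbols are hypothetical, and it covers only symbol removal, not the
   type and layout changes that abidiff also inspects.

```python
def exported_symbols(nm_output):
    """Collect defined dynamic symbol names from `nm -D --defined-only` text."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        # Drop any @VERSION / @@VERSION suffix added by symbol versioning.
        names.add(fields[-1].split("@", 1)[0])
    return names


# Hypothetical nm listings; addresses and symbol selection are illustrative only.
old_listing = """\
0000000000001000 T rocfft_setup
0000000000001100 T rocfft_plan_create
0000000000001200 T rocfft_execute
"""
new_listing = """\
0000000000002000 T rocfft_setup
0000000000002200 T rocfft_execute
"""

removed = sorted(exported_symbols(old_listing) - exported_symbols(new_listing))
print("removed symbols:", removed)
```

   An empty list for the real 7.1.1 and 7.2.4 libraries would be
   consistent with the abidiff result reported here.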

[ Other Info ]

 * ABI: abidiff 7.1.1 → 7.2.4 reports no changes (exit 0, no output).
   SOVERSION unchanged at 0 (librocfft.so.0). No debian/*.symbols file;
   symbol visibility is controlled by a version script (patch
   Add-version-script-to-control-exposed-symbols.patch).

 * This update is part of the coordinated ROCm 7.2.x stack release for
   Ubuntu stonking (26.10) and resolute (26.04 LTS).

 * PPA: https://launchpad.net/~igorluppi/+archive/ubuntu/rocfft-7.2.3

 * Upstream version comparison:
   https://github.com/ROCm/rocm-libraries/compare/rocm-7.1.1...rocm-7.2.4

** Affects: rocfft (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/2155392

Title:
  New upstream version 7.2.4
