[Bug target/124127] New: [16 Regression] 9% slowdown of 503.bwaves_r on Zen3 since r16-1644-gaba3b9d3a48a07

pheeck at gcc dot gnu.org via Gcc-bugs Mon, 16 Feb 2026 06:17:54 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=124127


            Bug ID: 124127
           Summary: [16 Regression] 9% slowdown of 503.bwaves_r on Zen3
                    since r16-1644-gaba3b9d3a48a07
           Product: gcc
           Version: 16.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: pheeck at gcc dot gnu.org
                CC: hjl at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

** This bug was split out from pr120957 **

As seen here

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=471.427.0

there was a 9% execution-time slowdown of the 503.bwaves_r SPEC 2017
benchmark when built with -Ofast -march=native and run on an AMD Zen3 machine.

I bisected the slowdown to r16-1644-gaba3b9d3a48a07

commit aba3b9d3a48a0703fd565f7c5f0caf604f59970b
Author:     H.J. Lu <[email protected]>
AuthorDate: Fri May 9 07:17:07 2025 +0800
Date:   Fri May 9 07:17:07 2025 +0800
Commit:     H.J. Lu <[email protected]>
CommitDate: Tue Jun 24 14:02:56 2025 +0800

    x86: Extend the remove_redundant_vector pass

    Extend the remove_redundant_vector pass to handle vector broadcasts from
    constant and variable scalars.  When broadcasting from constants and
    function arguments, we can place a single widest vector broadcast at
    entry of the nearest common dominator for basic blocks with all uses
    since constants and function arguments aren't changed.  For broadcast
    from variables with a single definition, the single definition is
    replaced with the widest broadcast.
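
To illustrate the kind of source pattern the commit message above describes, here is a minimal sketch in C using GCC vector extensions. The function name `scale` and its parameters are hypothetical and are not taken from bwaves or from the commit. It only shows a function argument being broadcast at two different vector widths in two different basic blocks. Whether the pass actually fires on this exact code is an assumption, since it works on RTL after vectorization and instruction selection.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang vector extensions: 2-lane and 4-lane vectors of double. */
typedef double v2df __attribute__((vector_size(16)));
typedef double v4df __attribute__((vector_size(32)));

/* Hypothetical kernel (not from bwaves): the function argument `s` is
   broadcast in two different basic blocks, once into a 128-bit vector
   and once into a 256-bit vector.  Per the commit message, broadcasts
   like these are candidates for being replaced by a single widest
   broadcast placed at the nearest common dominator of all uses. */
void scale(double *out, const double *in, int wide, double s)
{
    assert(out != NULL && in != NULL);

    if (wide) {
        v4df vs = {s, s, s, s};        /* 256-bit broadcast of s */
        v4df v;
        memcpy(&v, in, sizeof v);      /* unaligned load of 4 doubles */
        v *= vs;
        memcpy(out, &v, sizeof v);     /* unaligned store of 4 doubles */
    } else {
        v2df vs = {s, s};              /* 128-bit broadcast of s */
        v2df v;
        memcpy(&v, in, sizeof v);      /* unaligned load of 2 doubles */
        v *= vs;
        memcpy(out, &v, sizeof v);     /* unaligned store of 2 doubles */
    }
}
```

Comparing the assembly produced by a compiler just before and just after r16-1644 (for example with -O2 -mavx2 -S) is one way to check whether the broadcast of `s` is hoisted and widened for a given input.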


When this slowdown was introduced, bwaves was only slightly slower with GCC 16
than with GCC 15, but GCC 15 has improved since then, so I think this can be
considered a regression.  See the comparison here:
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=1064.427.0&plot.1=1181.427.0&plot.2=471.427.0


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
