[Bug tree-optimization/120751] [16 Regression] 10-15% slowdown of 454.calculix on Zen4 and Zen5 since r16-1001-g0291f53f8d2343

rguenth at gcc dot gnu.org via Gcc-bugs Mon, 19 Jan 2026 07:01:54 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120751


--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Of course, using an in-order reduction with two lanes, including building
the data vector from scalars, just to save 4 scalar multiplications is on
the border of profitability.  Per-stmt local costing is difficult here
though.
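
For illustration only, here is a minimal C sketch of the shape of such a
transform. It is not the calculix loop and not the vectorizer's actual
output: the function names and the v2df struct are hypothetical, and it
assumes a chain of eight products, so that four two-lane multiplies replace
eight scalar ones. The additions stay in source order, so only the
multiplications are actually done in vector form.

```c
#include <stddef.h>

/* Hypothetical stand-in for a two-lane double vector (e.g. V2DF). */
typedef struct { double lane[2]; } v2df;

/* Scalar form: n multiplications, n additions in strict source order. */
double dot_inorder_scalar(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Two-lane form: one (modelled) vector multiply per pair of elements,
   but the additions are still performed one lane at a time, in order. */
double dot_inorder_two_lanes(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        /* Building the operand vectors from scalars is extra work
           that the scalar form does not have. */
        v2df va = { { a[i], a[i + 1] } };
        v2df vb = { { b[i], b[i + 1] } };
        /* Models a single vector multiply covering both lanes. */
        v2df prod = { { va.lane[0] * vb.lane[0], va.lane[1] * vb.lane[1] } };
        /* In-order reduction: the add chain is as long as before. */
        sum += prod.lane[0];
        sum += prod.lane[1];
    }
    if (i < n)                  /* leftover element for odd n */
        sum += a[i] * b[i];
    return sum;
}
```

With n = 8 the second form does four vector multiplies instead of eight
scalar ones, but adds the vector builds and lane extracts while leaving the
dependent add chain untouched, which is why the gain is marginal at best.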

There's another PR about in-order reductions being somewhat pointless, but
in a case where we save nothing but loads (which in this case we don't).
