https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121994
Bug ID: 121994
Summary: [16 Regression] 15% slowdown of 538.imagick_r on AMD Zen2 since r16-3396-g9823624395a946
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: pheeck at gcc dot gnu.org
CC: liuhongt at gcc dot gnu.org
Blocks: 26163
Target Milestone: ---
Host: x86_64-linux
Target: x86_64-linux

As seen here
https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=287.507.0
there was a 15% exec time slowdown of the 538.imagick_r SPEC 2017 benchmark
when compiled with -Ofast -march=native -flto on an AMD Zen2 machine.

I bisected it to r16-3396-g9823624395a946.

commit 9823624395a946bb08a74e5aa4fb5d8bcebacfdf
Author: liuhongt <hongtao....@intel.com>
Date:   Mon Jul 28 18:06:06 2025 -0700

    Enable unroll in the vectorizer when there's reduction for FMA/DOT_PROD_EXPR/SAD_EXPR

Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)