On Tue, Jul 8, 2025 at 7:26 PM Richard Biener <richard.guent...@gmail.com> wrote:
>
> On Tue, Jul 8, 2025 at 12:48 PM H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> > aba3b9d3a48a0703fd565f7c5f0caf604f59970b is the first bad commit
> > commit aba3b9d3a48a0703fd565f7c5f0caf604f59970b
> > Author: H.J. Lu <hjl.to...@gmail.com>
> > Date:   Fri May 9 07:17:07 2025 +0800
> >
> >     x86: Extend the remove_redundant_vector pass
> >
> > which removed non all 0s/1s redundant vector loads, caused SPEC CPU 2017
> > 519.lbm_r and 470.lbm performance regressions on AMD znverN processors.
> > Add a tuning option to keep non all 0s/1s redundant vector loads on AMD
> > znverN processors.
>
> Do we know what actually happens here or is this basically reverting the
> change based on a new tunable and the reported regression?
>
> If I read the pass correctly it might insert broadcasts on paths where
> not originally computed (it inserts after the scalar def, which might be
> far away).  ix86_broadcast_inner suggests it replaces extracts from a
> broadcast with the original broadcast value/register which means it
> might increase lifetime of the broadcast register.
>
> Both shouldn't be causing specifically regressions on Zen2, but can be
> bad.  I think we need to understand better what the pass does (it's
> written without much commentary, so I tried to quickly reverse engineer
> it), and improve it, avoiding cases where it obviously increases
> register lifetime.

The regression doesn't show up on Intel processors.  This regression is
specific to AMD processors.  If there is a small testcase, I will find a
different way to fix it.

> > gcc/
> >
> >         PR target/120941
> >         * config/i386/i386-features.cc (ix86_broadcast_inner): Keep
> >         non all 0s/1s redundant vector loads if asked.
> >         * config/i386/x86-tune.def (X86_TUNE_KEEP_REDUNDANT_VECTOR_LOAD):
> >         New tuning.
> >
> > gcc/testsuite/
> >
> >         PR target/120941
> >         * gcc.target/i386/pr120941-1a.c: New test.
> >         * gcc.target/i386/pr120941-1b.c: Likewise.
> >         * gcc.target/i386/pr120941-1c.c: Likewise.
> >         * gcc.target/i386/pr120941-1d.c: Likewise.
> >
> > OK for master?
> >
> > Thanks.
> >
> > --
> > H.J.

-- 
H.J.