Since vectorizer unrolling regressed SPEC performance on Zen2 (see
PR121994), I guess it's related to register pressure and could be tuned
by adjusting reduc_lat_mult_thr. I don't have a Zen2 machine, so for
simplicity I'll just disable unrolling in the vectorizer for Zen2.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.
gcc/ChangeLog:
PR target/121994
* config/i386/x86-tune-costs.h (znver2_cost): Set
vect_unroll_limit to 1.
---
gcc/config/i386/x86-tune-costs.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 1649ea2fe3e..e9e4ddf108a 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1918,7 +1918,7 @@ struct processor_costs znver2_cost = {
FMA/DOT_PROD_EXPR/SAD_EXPR,
it's used to determine unroll
factor in the vectorizer. */
- 4, /* Limit how much the autovectorizer
+ 1, /* Limit how much the autovectorizer
may unroll a loop. */
znver2_memcpy,
znver2_memset,
--
2.34.1