Since it regressed SPEC performance (see PR121994), I guess
it's related to register pressure and can be tuned by adjusting
reduc_lat_mult_thr. I don't have a Zen2 machine, so for simplicity I'll
just disable unrolling in the vectorizer for Zen2.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

        PR target/121994
        * config/i386/x86-tune-costs.h (znver2_cost): Set
        vect_unroll_limit to 1.
---
 gcc/config/i386/x86-tune-costs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 1649ea2fe3e..e9e4ddf108a 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1918,7 +1918,7 @@ struct processor_costs znver2_cost = {
                                           FMA/DOT_PROD_EXPR/SAD_EXPR,
                                           it's used to determine unroll
                                           factor in the vectorizer.  */
-  4,                                   /* Limit how much the autovectorizer
+  1,                                   /* Limit how much the autovectorizer
                                           may unroll a loop.  */
   znver2_memcpy,
   znver2_memset,
-- 
2.34.1
