Since kind == vec_perm may not be a real vec_perm in the BB vectorizer (it can be just a broadcast or a simple load), restrict the avx256_avoid_vec_perm tune to loop vectorization.
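
For illustration only, the intended gating can be sketched as a standalone model. This is not the GCC code: cost_model, is_loop_vectorization and avoid_avx256_vec_perm are hypothetical stand-ins for ix86_vector_costs, the loop_vinfo check and TARGET_AVX256_AVOID_VEC_PERM, and the sketch assumes the block the loop moves into is only entered for loop vectorization.

```cpp
#include <array>
#include <climits>

// Simplified stand-in for the relevant state; not GCC's real class.
struct cost_model
{
  // Costs for prologue, body and epilogue (models m_costs[3]).
  std::array<unsigned, 3> costs{};
  // Number of 256-bit vec_perm statements seen per cost bucket.
  std::array<unsigned, 3> num_avx256_vec_perm{};
  // Hypothetical flag: true when costing a loop, false for BB (SLP) costing.
  bool is_loop_vectorization = false;
  // Hypothetical flag modelling TARGET_AVX256_AVOID_VEC_PERM.
  bool avoid_avx256_vec_perm = false;

  void finish_cost ()
  {
    // Only loop vectorization is rejected for counted vec_perms.
    // BB vectorization keeps its costs, because a counted vec_perm
    // there may really be a broadcast or a simple load.
    if (is_loop_vectorization)
      for (int i = 0; i != 3; i++)
        if (num_avx256_vec_perm[i] && avoid_avx256_vec_perm)
          costs[i] = INT_MAX;
  }
};
```

With the tune enabled and a vec_perm counted in the body, this model leaves the BB costs untouched and sets only the loop body cost to INT_MAX.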

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_vector_costs::finish_cost): Restrict
	tune avx256_avoid_vec_perm to loop vectorization only.
---
 gcc/config/i386/i386.cc | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 55c9b16dd38..5a02e12d634 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -26305,15 +26305,15 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)
           && (exact_log2 (LOOP_VINFO_VECT_FACTOR (loop_vinfo).to_constant ())
               > ceil_log2 (LOOP_VINFO_INT_NITERS (loop_vinfo))))
         m_costs[vect_body] = INT_MAX;
+
+      for (int i = 0; i != 3; i++)
+        if (m_num_avx256_vec_perm[i]
+            && TARGET_AVX256_AVOID_VEC_PERM)
+          m_costs[i] = INT_MAX;
     }
 
   ix86_vect_estimate_reg_pressure ();
 
-  for (int i = 0; i != 3; i++)
-    if (m_num_avx256_vec_perm[i]
-        && TARGET_AVX256_AVOID_VEC_PERM)
-      m_costs[i] = INT_MAX;
-
   /* When X86_TUNE_AVX512_TWO_EPILOGUES is enabled arrange for both
      a AVX2 and a SSE epilogue for AVX512 vectorized loops.  */
   if (loop_vinfo
-- 
2.34.1