https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105617
--- Comment #26 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #25)
> in vectorization, store shouldn't have such high cost because once the
> address is computed, the store instruction doesn't stall the pipeline
> (except for memory dependencies that are hard for compilers to detect). I'm
> experimenting with adjusting store costs: setting integer stores to
> COST_N_INSNS (1) - 1 (minus 1 since integer can benefit from renaming and
> have lower STLF costs), while keeping vector/floating-point stores at
> COST_N_INSNS (1). This approach should discourage vectorization for integer
> vector construction + vector store patterns, while still promoting
> vectorization for floating-point operations in vector construct + vector
> store scenarios.
>
>
> I'm testing below patch to see if there's any surprise.

Sth like this was also on my TODO list.  I'd have gone further and made
stores zero cost as we generally do not model AGU latency (also a zero
would more likely show up issues).

>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 52f82185e32..420104a04b2 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -25363,8 +25363,7 @@ ix86_default_vector_cost (enum vect_cost_for_stmt
> type_of_cost,
>                               : ix86_cost->int_load [2]) / 2;
>
>        case scalar_store:
> -        return COSTS_N_INSNS (fp ? ix86_cost->sse_store[0]
> -                              : ix86_cost->int_store [2]) / 2;
> +        return fp ? COSTS_N_INSNS (1) : COSTS_N_INSNS (1) - 1;

why is FP more costly than int?

>
>        case vector_stmt:
>          return ix86_vec_cost (mode,
> @@ -25378,11 +25377,7 @@ ix86_default_vector_cost (enum vect_cost_for_stmt
> type_of_cost,
>          return COSTS_N_INSNS (ix86_cost->sse_load[index]) / 2;
>
>        case vector_store:
> -        index = sse_store_index (mode);
> -        /* See PR82713 - we may end up being called on non-vector type.  */
> -        if (index < 0)
> -          index = 2;
> -        return COSTS_N_INSNS (ix86_cost->sse_store[index]) / 2;
> +        return ix86_vec_cost (mode, ix86_cost->sse_op);
>
>        case vec_to_scalar:
>        case scalar_to_vec:
> @@ -25398,11 +25393,7 @@ ix86_default_vector_cost (enum vect_cost_for_stmt
> type_of_cost,
>          return COSTS_N_INSNS (ix86_cost->sse_unaligned_load[index]) / 2;
>
>        case unaligned_store:
> -        index = sse_store_index (mode);
> -        /* See PR82713 - we may end up being called on non-vector type.  */
> -        if (index < 0)
> -          index = 2;
> -        return COSTS_N_INSNS (ix86_cost->sse_unaligned_store[index]) / 2;
> +        return ix86_vec_cost (mode, ix86_cost->sse_op);
>
>        case vector_gather_load:
>          return ix86_vec_cost (mode,
