On Tue, Feb 24, 2026 at 10:24 PM Richard Biener <[email protected]> wrote:
>
> On Tue, 24 Feb 2026, Richard Biener wrote:
>
> > The following allows vectorizing the gcc.target/i386/pr111023*.c
> > testcases again with -m32 -msse2 by ensuring we see through a cast
> > when looking for memory or vector extract sources during costing
> > of vector construction.
> >
> > This, together with the forwprop fix fixes the regression on those
> > testcases.
> >
> > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> >
> > OK if that succeeds?
>
> While that succeeds experimenting shows that zero- and sign-extends
> are not handled when moving from memory. I think we can do zero-extends
> for SImode and DImode (movd/movq) and for smaller modes via pre-zeroing
> of %xmm and pinsr. I'm leaving that for separate. Below is a revised
> patch that cleans up the various conditions and only touches the
> vector extract [ -> conversion ] -> vector CTOR path to allow all
> conversions.
>
> Another option would be to not disable MMX <-> SSE conversion patterns
> with -m32 or to revert another part of Honzas cost changes which regressed
> those testcases (kill the * 2 multiplication).
>
> Re-testing below patch.
>
> OK?

LGTM.
>
> Thanks,
> Richard.
>
> From ac2a80af61d57ff686dbdbd97095e1c329c250e5 Mon Sep 17 00:00:00 2001
> From: Richard Biener <[email protected]>
> Date: Tue, 24 Feb 2026 09:53:00 +0100
> Subject: [PATCH] target/120234 - adjust vector construction costs
> To: [email protected]
>
> The following allows vectorizing the gcc.target/i386/pr111023*.c
> testcases again with -m32 -msse2 by ensuring we see through a cast
> when looking for vector extract sources during costing of vector construction.
>
> This, together with the forwprop fix fixes the regression on those testcases.
>
>         PR target/120234
>         * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
>         For constructor elements always look through a conversion.
>         Rewrite load and vector extraction matching to be more obvious.
>         Allow arbitrary conversions from the vector extract to elide
>         costing of a gpr<->xmm move.
> ---
>  gcc/config/i386/i386.cc | 35 +++++++++++++++++++----------------
>  1 file changed, 19 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index 52f82185e32..acedc73b825 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -26427,26 +26427,29 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
>                   TREE_VISITED (op) = 1;
>                   gimple *def = SSA_NAME_DEF_STMT (op);
>                   tree tem;
> +                 /* Look through a conversion. */
>                   if (is_gimple_assign (def)
>                       && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))
>                       && ((tem = gimple_assign_rhs1 (def)), true)
> -                     && TREE_CODE (tem) == SSA_NAME
> -                     /* A sign-change expands to nothing. */
> -                     && tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (def)),
> -                                               TREE_TYPE (tem)))
> +                     && TREE_CODE (tem) == SSA_NAME)
>                     def = SSA_NAME_DEF_STMT (tem);
> -                 /* When the component is loaded from memory we can directly
> -                    move it to a vector register, otherwise we have to go
> -                    via a GPR or via vpinsr which involves similar cost.
> -                    Likewise with a BIT_FIELD_REF extracting from a vector
> -                    register we can hope to avoid using a GPR. */
> -                 if (!is_gimple_assign (def)
> -                     || ((!gimple_assign_load_p (def)
> -                          || (!TARGET_SSE4_1
> -                              && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1))
> -                         && (gimple_assign_rhs_code (def) != BIT_FIELD_REF
> -                             || !VECTOR_TYPE_P (TREE_TYPE
> -                                   (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))))
> +                 /* When the component is loaded from memory without sign-
> +                    or zero-extension we can move it to a vector register and/or
> +                    insert it via vpinsr with a memory operand. */
> +                 if (gimple_assign_load_p (def)
> +                     && tree_nop_conversion_p (TREE_TYPE (op),
> +                                               TREE_TYPE (gimple_assign_lhs (def)))
> +                     && (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) > 1
> +                         || TARGET_SSE4_1))
> +                   ;
> +                 /* When the component is extracted from a vector it is already
> +                    in a vector register. */
> +                 else if (is_gimple_assign (def)
> +                          && gimple_assign_rhs_code (def) == BIT_FIELD_REF
> +                          && VECTOR_TYPE_P (TREE_TYPE
> +                                (TREE_OPERAND (gimple_assign_rhs1 (def), 0))))
> +                   ;
> +                 else
>                     {
>                       if (fp)
>                         {
> --
> 2.51.0
>

--
BR,
Hongtao
