The following allows vectorizing the gcc.target/i386/pr111023*.c
testcases again with -m32 -msse2 by ensuring we see through a cast
when looking for memory or vector extract sources during costing
of vector construction.
This, together with the forwprop fix, fixes the regression on those testcases.
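
For illustration only, here is a minimal sketch of the kind of source
this is about.  It is not one of the pr111023 testcases, and the names
v8hi and widen_bytes are made up for this example.  Each element of the
vector is a widening conversion of a value loaded from memory, so if the
vectorizer ends up building the result with a CONSTRUCTOR, a conversion
sits between each load and the constructor element.  Such a conversion
is not a nop conversion, so as far as I can tell the old check would not
have looked through it and the element would have been costed as
needing a gpr->xmm move.

```c
/* Hypothetical example, not taken from the testsuite.  */
typedef short v8hi __attribute__ ((vector_size (16)));

/* Build a vector of eight shorts from eight bytes in memory.  Every
   element is a load followed by a char -> short widening, which is
   the load-plus-conversion shape the costing change looks through.  */
void
widen_bytes (v8hi *dst, const signed char *p)
{
  *dst = (v8hi) { p[0], p[1], p[2], p[3], p[4], p[5], p[6], p[7] };
}
```

Whether a given compiler version actually forms a vector CONSTRUCTOR
for this function depends on the options and on earlier passes, so
treat it as a picture of the pattern rather than a reproducer.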
Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
OK if that succeeds?
Thanks,
Richard.
PR target/120234
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
For constructor elements always look through a conversion
when determining whether to cost a gpr<->xmm move.
---
gcc/config/i386/i386.cc | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 52f82185e32..101fd80208e 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -26427,13 +26427,12 @@ ix86_vector_costs::add_stmt_cost (int count,
vect_cost_for_stmt kind,
TREE_VISITED (op) = 1;
gimple *def = SSA_NAME_DEF_STMT (op);
tree tem;
+ /* Look through a conversion, those can often be merged with
+ a load or extract. */
if (is_gimple_assign (def)
&& CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def))
&& ((tem = gimple_assign_rhs1 (def)), true)
- && TREE_CODE (tem) == SSA_NAME
- /* A sign-change expands to nothing. */
- && tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (def)),
- TREE_TYPE (tem)))
+ && TREE_CODE (tem) == SSA_NAME)
def = SSA_NAME_DEF_STMT (tem);
/* When the component is loaded from memory we can directly
move it to a vector register, otherwise we have to go
--
2.51.0