On Tue, May 08, 2018 at 01:25:35PM +0200, Allan Sandfeld Jensen wrote: > 2018-05-08 Allan Sandfeld Jensen <allan.jen...@qt.io>
2 spaces between date and name and two spaces between name and email address. > gcc/ > > PR tree-optimization/85692 > * tree-ssa-forwprop.c (simplify_vector_constructor): Try two > source permute as well. > > gcc/testsuite > > * gcc.target/i386/pr85692.c: Test two simply constructions are > detected as permute instructions. Just * gcc.target/i386/pr85692.c: New test. > > diff --git a/gcc/testsuite/gcc.target/i386/pr85692.c > b/gcc/testsuite/gcc.target/i386/pr85692.c > new file mode 100644 > index 00000000000..322c1050161 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr85692.c > @@ -0,0 +1,18 @@ > +/* { dg-do run } */ > +/* { dg-options "-O2 -msse4.1" } */ > +/* { dg-final { scan-assembler "unpcklps" } } */ > +/* { dg-final { scan-assembler "blendps" } } */ > +/* { dg-final { scan-assembler-not "shufps" } } */ > +/* { dg-final { scan-assembler-not "unpckhps" } } */ > + > +typedef float v4sf __attribute__ ((vector_size (16))); > + > +v4sf unpcklps(v4sf a, v4sf b) > +{ > + return v4sf{a[0],b[0],a[1],b[1]}; Though, not really sure if this has been tested at all. The above is valid only in C++ (and only C++11 and above), while the test is compiled as C and thus has to fail. In C one should use e.g. return (v4sf){a[0],b[0],a[1],b[1]}; instead (i.e. a compound literal). > @@ -2022,8 +2022,9 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi) > elem_type = TREE_TYPE (type); > elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type)); > > - vec_perm_builder sel (nelts, nelts, 1); > - orig = NULL; > + vec_perm_builder sel (nelts, 2, nelts); Why this change? I admit the vec_parm_builder arguments are confusing, but I think the second times third is the number of how many indices are being pushed into the vector, so I think (nelts, nelts, 1) is right. > @@ -2063,10 +2064,26 @@ simplify_vector_constructor (gimple_stmt_iterator > *gsi) > return false; > op1 = gimple_assign_rhs1 (def_stmt); > ref = TREE_OPERAND (op1, 0); > - if (orig) > + if (orig1) > { > - if (ref != orig) > - return false; > + if (ref == orig1 || orig2) > + { > + if (ref != orig1 && ref != orig2) > + return false; > + } > + else > + { > + if (TREE_CODE (ref) != SSA_NAME) > + return false; > + if (! VECTOR_TYPE_P (TREE_TYPE (ref)) > + || ! useless_type_conversion_p (TREE_TYPE (op1), > + TREE_TYPE (TREE_TYPE (ref)))) > + return false; > + if (TREE_TYPE (orig1) != TREE_TYPE (ref)) > + return false; I think even different type is acceptable here, as long as its conversion to orig1's type is useless. Furthermore, I think the way you wrote the patch with 2 variables rather than an array of 2 elements means too much duplication, this else block is a duplication of the else block below. See the patch I've added to the PR (and sorry for missing your patch first, the PR wasn't ASSIGNED and there was no link to gcc-patches for it). > @@ -2125,15 +2147,14 @@ simplify_vector_constructor (gimple_stmt_iterator > *gsi) > return false; > op2 = vec_perm_indices_to_tree (mask_type, indices); > if (conv_code == ERROR_MARK) > - gimple_assign_set_rhs_with_ops (gsi, VEC_PERM_EXPR, orig, orig, op2); > + gimple_assign_set_rhs_with_ops (gsi, VEC_PERM_EXPR, orig1, orig2, op2); > else > { > gimple *perm > - = gimple_build_assign (make_ssa_name (TREE_TYPE (orig)), > - VEC_PERM_EXPR, orig, orig, op2); > - orig = gimple_assign_lhs (perm); > + = gimple_build_assign (make_ssa_name (TREE_TYPE (orig1)), > + VEC_PERM_EXPR, orig1, orig2, op2); > gsi_insert_before (gsi, perm, GSI_SAME_STMT); > - gimple_assign_set_rhs_with_ops (gsi, conv_code, orig, > + gimple_assign_set_rhs_with_ops (gsi, conv_code, gimple_assign_lhs > (perm), Too long line. Jakub