On Mon, Aug 28, 2017 at 10:57 AM, Dominik Inführ <dominik.infu...@theobroma-systems.com> wrote: > Hi, > > As discussed on IRC: This patches fixes the calculation of the scalar costs > of SLP vectorization. I’ve added a test case and the auto_vec allocation is > now reused for all children of a node.
Looks good to me and thanks for the testcase. I'll include this in my next round of testing and make sure it's fixed for GCC 8. Thanks, Richard. > diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c > b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c > new file mode 100644 > index 0000000..5121a88 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/vect/costmodel/x86_64/costmodel-vect-slp.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-additional-options "-fdump-tree-slp-details" } */ > + > +#define N 4 > + > +int s1[N], s2[N], s3[N]; > +void escape(int, int, int, int); > + > +void > +foo () > +{ > + int t1, t2, t3, t4; > + > + t1 = s1[0] + s2[0] + s3[0]; > + t2 = s1[1] + s2[1] + s3[1]; > + t3 = s1[2] + s2[2] + s3[2]; > + t4 = s1[3] + s2[3] + s3[3]; > + > + s3[0] = t1 - s1[0] * s2[0]; > + s3[1] = t2 - s1[1] * s2[1]; > + s3[2] = t3 - s1[2] * s2[2]; > + s3[3] = t4 - s1[3] * s2[3]; > + > + escape (t1, t2, t3, t4); > +} > + > +/* { dg-final { scan-tree-dump-not "vectorization is not profitable" "slp2" > } } */ > +/* { dg-final { scan-tree-dump "basic block vectorized" "slp2" } } */ > diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c > index 04ecaab..cffd19b 100644 > --- a/gcc/tree-vect-slp.c > +++ b/gcc/tree-vect-slp.c > @@ -2690,9 +2690,17 @@ vect_bb_slp_scalar_cost (basic_block bb, > scalar_cost += stmt_cost; > } > > + auto_vec<bool, 20> subtree_life; > + > FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child) > if (SLP_TREE_DEF_TYPE (child) == vect_internal_def) > - scalar_cost += vect_bb_slp_scalar_cost (bb, child, life); > + { > + /* Do not directly pass LIFE to the recursive call, copy it to > + confine changes in the callee to the current child/subtree. */ > + subtree_life.safe_splice (*life); > + scalar_cost += vect_bb_slp_scalar_cost (bb, child, &subtree_life); > + subtree_life.truncate (0); > + } > > return scalar_cost; > } > >> On 04 Aug 2017, at 12:19, Richard Biener <richard.guent...@gmail.com> wrote: >> >> On Fri, Aug 4, 2017 at 12:08 PM, Dominik Inführ >> <dominik.infu...@theobroma-systems.com> wrote: >>> Hi, >>> >>> vect_bb_slp_scalar_cost computes the scalar cost of a SLP node. If there >>> are non-scalar uses of a definition, the costs for it and its operands >>> (children) are ignored. The vector LIFE is used to keep track of this and >>> an element is set to true, such that the def and its children are ignored. >>> But as soon as an element is set to true it is never undone, that means the >>> following sibling and parent nodes of the current node also stay ignored. >> >> Yes, that's intended. They are live as well because they are needed >> to compute the live scalar. >> >> This seems wrong to me, a simple fix would be to clone LIFE for every >> vector, such that changes to LIFE stay in the subtree: >>> >>> diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c >>> index 032a9444a5a..f919645f28b 100644 >>> --- a/gcc/tree-vect-slp.c >>> +++ b/gcc/tree-vect-slp.c >>> @@ -2590,6 +2590,11 @@ vect_bb_slp_scalar_cost (basic_block bb, >>> gimple *stmt; >>> slp_tree child; >>> >>> + auto_vec<bool, 20> subtree_life; >>> + subtree_life.safe_splice (*life); >>> + >>> + life = &subtree_life; >>> + >>> FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt) >>> { >>> unsigned stmt_cost; >>> >>> >