> Am 20.07.2023 um 16:09 schrieb Richard Sandiford <richard.sandif...@arm.com>: > > Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >> When we materialize a layout we push edge permutes to constant/external >> defs without checking we can actually do so. For externals defined >> by vector stmts rather than scalar components we can't. >> >> Bootstrapped and tested on x86_64-unknown-linux-gnu. >> >> OK? >> >> Thanks, >> Richard. >> >> PR tree-optimization/110742 >> * tree-vect-slp.cc (vect_optimize_slp_pass::get_result_with_layout): >> Do not materialize an edge permutation in an external node with >> vector defs. >> (vect_slp_analyze_node_operations_1): Guard purely internal >> nodes better. >> >> * g++.dg/torture/pr110742.C: New testcase. >> --- >> gcc/testsuite/g++.dg/torture/pr110742.C | 47 +++++++++++++++++++++++++ >> gcc/tree-vect-slp.cc | 8 +++-- >> 2 files changed, 53 insertions(+), 2 deletions(-) >> create mode 100644 gcc/testsuite/g++.dg/torture/pr110742.C >> >> diff --git a/gcc/testsuite/g++.dg/torture/pr110742.C >> b/gcc/testsuite/g++.dg/torture/pr110742.C >> new file mode 100644 >> index 00000000000..d41ac0479d2 >> --- /dev/null >> +++ b/gcc/testsuite/g++.dg/torture/pr110742.C >> @@ -0,0 +1,47 @@ >> +// { dg-do compile } >> + >> +struct HARD_REG_SET { >> + HARD_REG_SET operator~() const { >> + HARD_REG_SET res; >> + for (unsigned int i = 0; i < (sizeof(elts) / sizeof((elts)[0])); ++i) >> + res.elts[i] = ~elts[i]; >> + return res; >> + } >> + HARD_REG_SET operator&(const HARD_REG_SET &other) const { >> + HARD_REG_SET res; >> + for (unsigned int i = 0; i < (sizeof(elts) / sizeof((elts)[0])); ++i) >> + res.elts[i] = elts[i] & other.elts[i]; >> + return res; >> + } >> + unsigned long elts[4]; >> +}; >> +typedef const HARD_REG_SET &const_hard_reg_set; >> +inline bool hard_reg_set_subset_p(const_hard_reg_set x, const_hard_reg_set >> y) { >> + unsigned long bad = 0; >> + for (unsigned int i = 0; i < (sizeof(x.elts) / sizeof((x.elts)[0])); ++i) >> + bad |= (x.elts[i] & ~y.elts[i]); >> + return bad == 0; >> +} >> +inline bool hard_reg_set_empty_p(const_hard_reg_set x) { >> + unsigned long bad = 0; >> + for (unsigned int i = 0; i < (sizeof(x.elts) / sizeof((x.elts)[0])); ++i) >> + bad |= x.elts[i]; >> + return bad == 0; >> +} >> +extern HARD_REG_SET rr[2]; >> +extern int t[2]; >> +extern HARD_REG_SET nn; >> +static HARD_REG_SET mm; >> +void setup_reg_class_relations(void) { >> + HARD_REG_SET intersection_set, union_set, temp_set2; >> + for (int cl2 = 0; cl2 < 2; cl2++) { >> + temp_set2 = rr[cl2] & ~nn; >> + if (hard_reg_set_empty_p(mm) && hard_reg_set_empty_p(temp_set2)) { >> + mm = rr[0] & nn; >> + if (hard_reg_set_subset_p(mm, intersection_set)) >> + if (!hard_reg_set_subset_p(mm, temp_set2) || >> + hard_reg_set_subset_p(rr[0], rr[t[cl2]])) >> + t[cl2] = 0; >> + } >> + } >> +} >> diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc >> index 693621ca990..1d79c77e8ce 100644 >> --- a/gcc/tree-vect-slp.cc >> +++ b/gcc/tree-vect-slp.cc >> @@ -5198,7 +5198,10 @@ vect_optimize_slp_pass::get_result_with_layout >> (slp_tree node, >> return result; >> >> if (SLP_TREE_DEF_TYPE (node) == vect_constant_def >> - || SLP_TREE_DEF_TYPE (node) == vect_external_def) >> + || (SLP_TREE_DEF_TYPE (node) == vect_external_def >> + && (to_layout_i == 0 >> + /* We can't permute vector defs. */ >> + || SLP_TREE_VEC_DEFS (node).is_empty ()))) > > Guess it's personal preference, but IMO it's easier to follow without the > to_layout_i condition, so that it ties directly to the create_partitions > test. I don’t understand- in the code guarding this we seem to expect to_layout_i == 0 and that’s the case we can handle as noop. I didn’t understand why the function doesn’t always just do nothing in this case though, so I must have missed something. Richard > (Would be nice to have a name for whatever a node matching the new > condition is, but I don't have any good ideas.) > > LGTM otherwise FWIW. > > Thanks, > Richard > >> { >> /* If the vector is uniform or unchanged, there's nothing to do. */ >> if (to_layout_i == 0 || vect_slp_tree_uniform_p (node)) >> @@ -5944,7 +5947,8 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, >> slp_tree node, >> calculated by the recursive call). Otherwise it is the number of >> scalar elements in one scalar iteration (DR_GROUP_SIZE) multiplied by >> VF divided by the number of elements in a vector. */ >> - if (!STMT_VINFO_DATA_REF (stmt_info) >> + if (SLP_TREE_CODE (node) != VEC_PERM_EXPR >> + && !STMT_VINFO_DATA_REF (stmt_info) >> && REDUC_GROUP_FIRST_ELEMENT (stmt_info)) >> { >> for (unsigned i = 0; i < SLP_TREE_CHILDREN (node).length (); ++i)
Re: [PATCH] tree-optimization/110742 - fix latent issue with permuting existing vectors
Richard Biener via Gcc-patches Thu, 20 Jul 2023 07:58:10 -0700
- [PATCH] tree-optimization/110742 - fix l... Richard Biener via Gcc-patches
- Re: [PATCH] tree-optimization/11074... Richard Sandiford via Gcc-patches
- Re: [PATCH] tree-optimization/1... Richard Biener via Gcc-patches
- Re: [PATCH] tree-optimizati... Richard Sandiford via Gcc-patches
- Re: [PATCH] tree-optimi... Richard Biener via Gcc-patches
- Re: [PATCH] tree-o... Richard Sandiford via Gcc-patches
- Re: [PATCH] tr... Richard Biener via Gcc-patches
- Re: [PATCH] tree-optimization/11074... Jeff Law via Gcc-patches