https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
   Target Milestone|---                           |14.0
           Keywords|                              |testsuite-fail
             Status|NEW                           |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.  We have

/* Merge
   c = VEC_PERM_EXPR <a, b, VCST0>;
   d = VEC_PERM_EXPR <c, c, VCST1>;
   to
   d = VEC_PERM_EXPR <a, b, NEW_VCST>;  */

and tree-ssa-forwprop.cc has some other special-cases.  What we lack is
simplification of two consecutive permutes.

> For gimple IRs:
>
>   res_3 = VEC_PERM_EXPR <res_2(D), high_1(D), { 0, 3 }>;
>   res_5 = VEC_PERM_EXPR <low_4(D), res_3, { 0, 3 }>;
>
> I'd expect it can be further optimized into
>
>   res_5 = VEC_PERM_EXPR <low_4(D), high_1(D), { 0, 3 }>;

where I think the vectors are all vector(2) unsigned long

this works because the later permute replaces all elements the first
permute uses from the first or second element.  Thus the key is to
identify whether the inherited elements are all from a single operand
of the first source (and which ones).
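
The check described above can be sketched outside the compiler as follows. This is a standalone illustration, not code from tree-ssa-forwprop.cc or match.pd: the names `MergedPermute`, `merge_consecutive_permutes` and `c_pos` are hypothetical, selectors are modelled as plain vectors of indices, and indices outside 0..2N-1 are simply rejected rather than handled.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Outcome of a successful merge: which operand of the first permute
// (0 = a, 1 = b) takes the place of c, and the rewritten selector.
struct MergedPermute {
    unsigned first_operand;
    std::vector<unsigned> selector;
};

// Models
//   c = VEC_PERM_EXPR <a, b, sel0>;
//   d = VEC_PERM_EXPR <op0, op1, sel1>;   with c as operand number c_pos
// Returns the merged form if every lane d takes from c traces back to a
// single operand of the first permute, otherwise std::nullopt.
std::optional<MergedPermute>
merge_consecutive_permutes(const std::vector<unsigned>& sel0,
                           const std::vector<unsigned>& sel1,
                           unsigned c_pos)
{
    const unsigned n = static_cast<unsigned>(sel0.size());
    if (n == 0 || sel1.size() != n || c_pos > 1)
        return std::nullopt;

    bool have_source = false;
    unsigned source = 0;
    std::vector<unsigned> merged(sel1);   // lanes not read from c stay as-is

    for (unsigned i = 0; i < n; ++i) {
        if (sel1[i] >= 2 * n)
            return std::nullopt;          // out-of-range index: not handled
        if (sel1[i] / n != c_pos)
            continue;                     // lane comes from the other operand
        const unsigned inner = sel0[sel1[i] % n];
        if (inner >= 2 * n)
            return std::nullopt;          // out-of-range index: not handled
        const unsigned operand = inner / n;   // 0 = a, 1 = b
        const unsigned lane = inner % n;
        assert(lane < n);
        if (have_source && operand != source)
            return std::nullopt;          // inherits from both a and b
        have_source = true;
        source = operand;
        merged[i] = c_pos * n + lane;     // read the lane directly from a or b
    }
    if (!have_source)
        return std::nullopt;              // d never reads c; nothing to merge
    return MergedPermute{source, merged};
}
```

For the quoted IR, `sel0 = { 0, 3 }`, `sel1 = { 0, 3 }` and `c_pos = 1` (res_3 is the second operand of the later permute). The only lane inherited from res_3 is its element 1, which the first permute took from high_1(D), so the sketch reports operand 1 with selector `{ 0, 3 }`, matching the expected `VEC_PERM_EXPR <low_4(D), high_1(D), { 0, 3 }>`.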