https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111228

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |14.0
           Keywords|                            |testsuite-fail
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |rguenth at gcc dot gnu.org

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Mine.  We have

/* Merge
   c = VEC_PERM_EXPR <a, b, VCST0>;
   d = VEC_PERM_EXPR <c, c, VCST1>;
   to
   d = VEC_PERM_EXPR <a, b, NEW_VCST>;  */

and tree-ssa-forwprop.cc has some other special cases.  What we lack is a
general simplification of two consecutive permutes.
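
To make the existing pattern concrete, here is a minimal sketch in Python
(not GCC code).  It models VEC_PERM_EXPR as selecting from the
concatenation of both operands, with indices taken modulo twice the
vector length.  The helper names vec_perm and merge_same_operand are
made up for illustration only.

```python
def vec_perm(v0, v1, sel):
    """Model of VEC_PERM_EXPR <v0, v1, sel> on plain Python lists."""
    n = len(v0)
    both = list(v0) + list(v1)
    # Selector indices wrap modulo 2*n, as for VEC_PERM_EXPR.
    return [both[i % (2 * n)] for i in sel]


def merge_same_operand(sel0, sel1):
    """Selector NEW_VCST for d = VEC_PERM <c, c, sel1>
    where c = VEC_PERM <a, b, sel0> (hypothetical helper)."""
    n = len(sel0)
    # Both operands of the outer permute are c, so index i picks c[i % n],
    # which in turn is element sel0[i % n] of the concatenation a ++ b.
    return [sel0[i % n] for i in sel1]


a, b = [10, 11], [20, 21]
sel0, sel1 = [0, 3], [1, 2]
c = vec_perm(a, b, sel0)
d = vec_perm(c, c, sel1)
new_sel = merge_same_operand(sel0, sel1)
print(d, vec_perm(a, b, new_sel))
```

Both lists printed at the end should be identical, since the merged
selector reads straight from a and b.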

> For gimple IRs:
> 
>  res_3 = VEC_PERM_EXPR <res_2(D), high_1(D), { 0, 3 }>;
>  res_5 = VEC_PERM_EXPR <low_4(D), res_3, { 0, 3 }>;
>
> I'd expect it can be further optimized into
>
>  res_5 = VEC_PERM_EXPR <low_4(D), high_1(D), { 0, 3 }>;

where I think the vectors are all vector(2) unsigned long.  This works
because the later permute replaces all elements the first permute
takes from one of its two operands (here res_2(D)), so only elements
of the other operand survive.  Thus the key is to identify whether the
inherited elements all come from a single operand of the first permute
(and which one).
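
The check described above could be sketched as follows, again in
Python rather than as a match.pd or forwprop patch.  It only handles the
shape from the report, where the inner permute is the second operand of
the outer one.  The function name fold_through_second_operand and its
return convention are assumptions made for this illustration.

```python
def vec_perm(v0, v1, sel):
    """Model of VEC_PERM_EXPR <v0, v1, sel> on plain Python lists."""
    n = len(v0)
    both = list(v0) + list(v1)
    return [both[i % (2 * n)] for i in sel]


def fold_through_second_operand(sel0, sel1):
    """Given inner = VEC_PERM <x, y, sel0> and outer = VEC_PERM <z, inner, sel1>,
    return (operand, new_sel) so that outer == VEC_PERM <z, x or y, new_sel>,
    where operand is 0 for x and 1 for y.  Return None when the lanes
    inherited from inner come from both x and y.  operand is None if the
    outer permute takes no lane from inner at all."""
    n = len(sel0)
    source = None
    new_sel = []
    for i in sel1:
        i %= 2 * n
        if i < n:
            new_sel.append(i)          # lane comes from z, unchanged
            continue
        j = sel0[i - n] % (2 * n)      # where inner got this lane from
        operand = 0 if j < n else 1
        if source is None:
            source = operand
        elif source != operand:
            return None                # mixed x and y, no single-permute fold
        new_sel.append(n + j % n)      # same lane, now in the second operand
    return source, new_sel


# The case from the report: all inherited lanes come from high_1.
res_2, high_1, low_4 = [1, 2], [3, 4], [5, 6]
res_3 = vec_perm(res_2, high_1, [0, 3])
res_5 = vec_perm(low_4, res_3, [0, 3])
folded = fold_through_second_operand([0, 3], [0, 3])
print(folded, res_5, vec_perm(low_4, high_1, folded[1]))
```

For the selectors in the report the sketch picks the second operand of
the first permute (high_1) and keeps the selector { 0, 3 }, matching the
expected result quoted above.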
