https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92822
rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org
           Assignee|rguenth at gcc dot gnu.org  |rsandifo at gcc dot gnu.org

--- Comment #5 from rsandifo at gcc dot gnu.org <rsandifo at gcc dot gnu.org> ---
I think this is mostly a target problem.  We weren't providing patterns
to extract a 64-bit vector from a 128-bit vector, despite that being very
much a native operation.  Adding those fixes most of the problems.

What's left is that:

  v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, ?, ? }>;
  ...extract first half of v4sf_b...

gets filled out as:

  v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, 2, 3 }>;
  ...extract first half of v4sf_b...

and we never recover from the awkwardness of that permute.

The easiest fix seems to be to extend a partial duplicate to a full
duplicate instead of a partial duplicate followed by a partial blend.
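
To illustrate the last point, here is a minimal sketch in Python (not GCC code) that models the selector semantics of VEC_PERM_EXPR on plain lists.  It assumes each selector element indexes into the concatenation of the two input vectors, taken modulo twice the vector length.  The helper names vec_perm, low_half and is_full_duplicate are hypothetical and exist only for this example.  It shows that filling the don't-care lanes as { 1, 1, 1, 1 } gives the same first half as { 1, 1, 2, 3 }, while leaving a selector that is a plain duplicate of one element.

```python
def vec_perm(a, b, sel):
    """Model of VEC_PERM_EXPR <a, b, sel> on Python lists (illustrative only).

    Assumption: selector element i picks lane i from the concatenation
    a + b, with the index reduced modulo 2 * len(a).
    """
    n = len(a)
    both = list(a) + list(b)
    return [both[i % (2 * n)] for i in sel]


def low_half(v):
    """Return the first half of a vector, e.g. 64 bits of a 128-bit V4SF."""
    return v[: len(v) // 2]


def is_full_duplicate(sel):
    """True if every lane selects the same input element (a broadcast)."""
    return len(set(sel)) == 1


# Arbitrary sample values standing in for v4sf_a.
v4sf_a = [10.0, 11.0, 12.0, 13.0]

# { 1, 1, ?, ? } filled out as a partial duplicate plus a partial blend.
blend_sel = [1, 1, 2, 3]
# { 1, 1, ?, ? } filled out as a full duplicate of element 1.
dup_sel = [1, 1, 1, 1]

blend = vec_perm(v4sf_a, v4sf_a, blend_sel)
dup = vec_perm(v4sf_a, v4sf_a, dup_sel)

# Only the first half is ever used, so both fillings are equivalent there.
print(low_half(blend), low_half(dup))
print(is_full_duplicate(blend_sel), is_full_duplicate(dup_sel))
```

Because only the first half of v4sf_b is extracted, either filling is valid; the difference is that the second selector has the shape of a single-element duplicate, which is the form the comment proposes to generate.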