https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92822

--- Comment #6 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to rsand...@gcc.gnu.org from comment #5)
> I think this is mostly a target problem.  We weren't providing
> patterns to extract a 64-bit vector from a 128-bit vector,
> despite that being very much a native operation.
> 
> Adding those fixes most of the problems.  What's left is that:
> 
>   v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, ?, ? }>;
>   ...extract first half of v4sf_b...
> 
> gets filled out as:
> 
>   v4sf_b = VEC_PERM_EXPR <v4sf_a, v4sf_a, { 1, 1, 2, 3 }>;
>   ...extract first half of v4sf_b...
> 
> and we never recover from the awkwardness of that permute.
> The easiest fix seems to be to extend a partial duplicate
> to a full duplicate instead of a partial duplicate followed
> by a partial blend.

The ugliness is that all these "heuristics" need to be in the
permutation detection code, since we can't encode a "we do not care"
value in the permute mask.  A way out (with more GIMPLE stmts) would
have been to avoid the "do not care" elements.  In this particular
case we only need the lower part anyway...
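
To illustrate the two ways of filling the "do not care" lanes discussed
above, here is a toy sketch in plain C.  It is not GCC code and says
nothing about how simplify_vector_constructor is actually written;
DONT_CARE, fill_identity and fill_duplicate are hypothetical names, and
the -1 marker exists only in this model, precisely because a real
permute mask cannot encode it.

```c
#include <assert.h>

/* Hypothetical marker for a lane whose value is never used.
   Real VEC_PERM_EXPR masks have no such encoding. */
#define DONT_CARE (-1)

/* Fill unused lanes with the identity index: lane i selects element i.
   { 1, 1, ?, ? } becomes { 1, 1, 2, 3 }, i.e. a partial duplicate
   followed by a partial blend. */
static void fill_identity(const int *in, int *out, int n)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        out[i] = (in[i] == DONT_CARE) ? i : in[i];
}

/* If every used lane selects the same element, extend the mask to a
   full duplicate: { 1, 1, ?, ? } becomes { 1, 1, 1, 1 }.
   Otherwise fall back to the identity fill.
   Returns 1 when a full duplicate was produced, 0 otherwise. */
static int fill_duplicate(const int *in, int *out, int n)
{
    int elt = DONT_CARE;

    assert(n > 0);
    for (int i = 0; i < n; i++) {
        if (in[i] == DONT_CARE)
            continue;
        if (elt == DONT_CARE) {
            elt = in[i];              /* first used lane fixes the element */
        } else if (in[i] != elt) {
            fill_identity(in, out, n); /* not a duplicate at all */
            return 0;
        }
    }
    if (elt == DONT_CARE) {            /* no lane is used */
        fill_identity(in, out, n);
        return 0;
    }
    for (int i = 0; i < n; i++)
        out[i] = elt;
    return 1;
}
```

In this model only the second filling yields a mask that a target could
match as a single duplicate of element 1.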

I'm going to try some heuristic tweaking for PR92819 (not this week,
but maybe next...).  The simplify_vector_constructor function is already
quite complicated, and providing testsuite coverage for all paths is hard...
