https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109153

--- Comment #3 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #2)
> On the GIMPLE side we should canonicalize here I think, at which point
> inserts into a splatted vector become more profitable depends?
> 
>   _4 = VEC_PERM_EXPR <a_2(D), b_3(D), { 0, 8, 1, 9, 2, 10, 3, 11 }>;
>   _5 = VEC_PERM_EXPR <a_2(D), b_3(D), { 4, 12, 5, 13, 6, 14, 7, 15 }>;
>   _6 = {_4, _5};
> 
> we have simplify_vector_constructor in tree-ssa-forwprop.cc.
>

Ah great! 
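
To spell out why the constructor in the quote is a canonicalization candidate, here is a minimal Python model of the two permutes.  It is only an illustration: vec_perm is a hypothetical helper, not GCC code, and it assumes the usual VEC_PERM_EXPR rule that selector indices address the concatenation of the two inputs.  Under that assumption, {_4, _5} is simply the full element-wise interleave of a and b.

```python
def vec_perm(a, b, sel):
    """Toy model of VEC_PERM_EXPR <a, b, sel> (hypothetical helper).

    Indices 0..n-1 select from a, n..2n-1 select from b; indices are
    reduced modulo 2n.
    """
    both = list(a) + list(b)
    return [both[i % len(both)] for i in sel]


# Symbolic 8-element inputs standing in for a_2(D) and b_3(D).
a = [f"a{i}" for i in range(8)]
b = [f"b{i}" for i in range(8)]

lo = vec_perm(a, b, [0, 8, 1, 9, 2, 10, 3, 11])     # models _4
hi = vec_perm(a, b, [4, 12, 5, 13, 6, 14, 7, 15])   # models _5
combined = lo + hi                                  # models _6 = {_4, _5}

# The same 16 elements written as one interleave of a and b.
interleaved = [x for pair in zip(a, b) for x in pair]
```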

> For the other BIT_INSERT_EXPR case I'd go to match.pd, but adding a function
> to forwprop is also possible.
> 
> If we want to expand { 4, 4, _1, 4, 4, ..} with splat + insert we should
> IMHO do that at RTL expansion time where we already try splat (I think).
> Not sure how to apply costing there though.  There's also the possibility
> to expand { a, a, b, b, a, b, a, ... } with two splat + blend.  For
> vec_init RTL expansion the target has full control, so it can decide for
> itself (if we do not want to do anything in generic code).

OK, so the suggestion is to canonicalize to the simplest vector constructor
form in GIMPLE and then deal with it in vec_init?  That makes sense.  I
initially thought GIMPLE was the easier place, since modifying constructors is
simpler in GIMPLE than in RTL.

But it looks like we already do all the "costing"-based pattern checks in
aarch64_expand_vector_init, so, as you said, simplifying the vector
constructors should just make it work.
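
As a rough sketch of the kind of pattern check meant here (purely illustrative: classify_constructor and its strategy names are hypothetical, it does no real costing, and it is not what aarch64_expand_vector_init actually implements), the two shapes from the quote could be told apart like this:

```python
from collections import Counter


def classify_constructor(elts):
    """Pick an expansion strategy for a vector constructor (hypothetical).

    Real target code would weigh instruction costs; this sketch only
    looks at how many distinct element values there are.
    """
    counts = Counter(elts)
    if len(counts) == 1:
        return "splat"
    _, most_common_count = counts.most_common(1)[0]
    if len(elts) - most_common_count == 1:
        # Exactly one lane differs from the splatted value.
        return "splat+insert"
    if len(counts) == 2:
        # Two distinct values in arbitrary lanes.
        return "two splats + blend"
    return "generic"


# The two shapes mentioned in the quoted comment.
one_odd_lane = classify_constructor([4, 4, "_1", 4, 4, 4, 4, 4])
two_values = classify_constructor(["a", "a", "b", "b", "a", "b", "a", "b"])
```

A real implementation would replace the fixed ordering of the checks with target cost comparisons.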

So I will go with that and extend aarch64_expand_vector_init if needed.  Thanks!
