https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83203

--- Comment #4 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #3)

> Now, because this is generic tuning we force that into stack.
> Though I must repeat for the nth time that this is very confusing; either
> for some AMD chips (is it really that bad in contemporary ones) vmovd is way
> too expensive, but then either vpinsrq is also too expensive (in that case
> we should be happy we emit what we do now on the trunk; but then
> <sse2p4_1>_pinsr<ssemodesuffix> should use Yi instead of x or v in
> alternatives with r input; and similarly use Yi in vec_concatv2di in the
> vpinsrq and pinsrq alternatives), or vmovd is expensive, but vpinsrq is not,
> then we just should use vpinsrq for the vec_concatv2di pattern,
> (i.e. add alternative for =x,r,C which will split into clearing the
> destination plus vpinsrq).

AFAICT, pinsr is expensive with either a memory or a register operand. Some
time ago, my idea was to implement the missing direct SImode and DImode
moves for AMD targets with "pinsr $0, ..." and "pextr $0, ...", but the idea
was scrapped since these insns were worse than moving the value through memory.

> Another thing is that with -O2 -mavx2 -mtune=intel we emit:
>         vmovq   %rdi, %xmm0
>         vmovdqa %xmm0, %xmm0
>         ret
> when we could just emit
>         vmovq   %rdi, %xmm0
> I think.  I guess we'd need a pattern for combine that would match what
> combiner's trying:
> (set (reg:V4DI 90)
>     (vec_concat:V4DI (vec_concat:V2DI (reg/v:DI 88 [ x ])
>             (const_int 0 [0]))
>         (const_vector:V2DI [
>                 (const_int 0 [0])
>                 (const_int 0 [0])
>             ])))
> and perhaps simplify that into something different - vec_select from all
> zeros and vec_duplicate, so that we don't need to list all weird cases?
> Though perhaps the r254548 change goes here in the wrong direction.

Maybe we can handle these in the middle end in some generic way, especially
when the combination simplifies to a simple move.
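
For illustration only, here is a small sketch of the equivalence the combiner
would have to see: a V4DI built as "(x, 0) concatenated with (0, 0)" is the
same value as x zero-extended into the low element. This is not the testcase
from the PR; the function names are made up, and it uses the GNU C vector
extensions rather than the intrinsics so that it builds without -mavx2.
Results are stored through a pointer only to avoid by-value ABI warnings for
32-byte vectors.

```c
#include <assert.h>

/* GNU C vector types: 2 x 64-bit and 4 x 64-bit integer lanes. */
typedef long long v2di __attribute__((vector_size(16)));
typedef long long v4di __attribute__((vector_size(32)));

/* Mirrors the nested shape of the RTL quoted above:
   (vec_concat:V4DI (vec_concat:V2DI x 0) (const_vector:V2DI [0 0])).
   Hypothetical helper name. */
static void zext_nested(long long x, v4di *out)
{
    v2di lo = { x, 0 };   /* inner vec_concat of x and 0 */
    v2di hi = { 0, 0 };   /* all-zero upper half */
    *out = (v4di){ lo[0], lo[1], hi[0], hi[1] };
}

/* The simplified form: x in lane 0, every other lane zero.
   Hypothetical helper name. */
static void zext_flat(long long x, v4di *out)
{
    *out = (v4di){ x, 0, 0, 0 };
}

/* Lane-by-lane comparison; returns 1 when the vectors are equal. */
static int same_v4di(const v4di *a, const v4di *b)
{
    for (int i = 0; i < 4; i++)
        if ((*a)[i] != (*b)[i])
            return 0;
    return 1;
}

/* Asserts that both forms produce the same vector for a given x. */
static void check_zext(long long x)
{
    v4di a, b;
    zext_nested(x, &a);
    zext_flat(x, &b);
    assert(same_v4di(&a, &b));
}
```

Since both helpers compute the same value, a single move that zeroes the
upper lanes would be enough for either form. Whether the compiler actually
emits one depends on the tuning and patterns discussed above.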
