reload, and generic should think about it.

venkataramanan.kumar at amd dot com Mon, 21 Aug 2017 23:43:55 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80820


Venkataramanan <venkataramanan.kumar at amd dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |venkataramanan.kumar at amd dot com

--- Comment #4 from Venkataramanan <venkataramanan.kumar at amd dot com> ---
(In reply to Peter Cordes from comment #0)
> gcc with -mtune=generic likes to bounce through memory when moving data from
> integer registers to xmm for things like _mm_set_epi32.
> 
> There are 3 related tuning issues here:
> 
> * -mtune=haswell -mno-sse4 still uses one store/reload for _mm_set_epi64x.
> 
> * -mtune=znver1 should definitely favour movd/movq instead of store/reload.
>   (Ryzen has 1 m-op movd/movq between vector and integer with 3c latency,
>   shorter than store-forwarding.  All the reasons to favour store/reload on
>   other AMD uarches are gone.)
> 

Yes. For Ryzen, using direct move instructions should be better than relying
on store-forwarding.
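
For reference, here is a minimal sketch of the kind of code under discussion:
two 64-bit values that start out in integer registers are combined into one
xmm register. The function names (make_vec, low_half, high_half) are
hypothetical and not taken from the bug report, and the sketch assumes an
x86-64 target with SSE2. Compiling it with -O2 -S under different -mtune
settings and reading the assembly should show whether the compiler picks
direct moves or a store/reload for make_vec.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: build a 128-bit vector from two 64-bit integers
 * that arrive in general-purpose registers. This is the operation whose
 * code generation the bug report is about. */
__m128i make_vec(int64_t hi, int64_t lo)
{
    /* First argument becomes the high 64 bits, second the low 64 bits. */
    return _mm_set_epi64x(hi, lo);
}

/* Hypothetical helper: move the low 64 bits back to an integer register
 * (x86-64 only). */
int64_t low_half(__m128i v)
{
    return _mm_cvtsi128_si64(v);
}

/* Hypothetical helper: move the high 64 bits back to an integer register
 * by first copying the high half into the low half. */
int64_t high_half(__m128i v)
{
    return _mm_cvtsi128_si64(_mm_unpackhi_epi64(v, v));
}
```

The two extraction helpers are only there so the result can be checked; the
tuning question applies to make_vec.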

[Bug target/80820] _mm_set_epi64x shouldn't store/reload for -mtune=haswell, Zen should avoid store/reload, and generic should think about it.
