https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109944

Alexander Monakov <amonakov at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amonakov at gcc dot gnu.org

--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #3)
> so we're building SImode elements in %xmm regs and then
> unpack them - that's probably better than a series of
> pinsrw due to dependences.  For uarchs where gpr->xmm
> moves are costly it might be better to do
> 
>   pxor %xmm0, %xmm0
>   pinsrw $0, (%rsi), %xmm0
>   pinsrw $1, 32(%rsi), %xmm0
> 
> though?

I'm afraid that is impossible: pinsrw with a memory operand will attempt to
load 2 bytes, but only 1 byte is guaranteed to be accessible (the element may
be the last byte of a page).
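
To make the end-of-page case concrete, here is a minimal sketch, assuming
4 KiB pages. The helper name access_crosses_page and the PAGE_SIZE_ASSUMED
constant are hypothetical, for illustration only, and not anything in GCC:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as is typical on x86-64. */
#define PAGE_SIZE_ASSUMED 4096u

/* Hypothetical helper: returns true if an access of `size` bytes starting
   at `addr` touches more than one page (size must be nonzero). */
static bool access_crosses_page(uintptr_t addr, size_t size)
{
    uintptr_t first_page = addr / PAGE_SIZE_ASSUMED;
    uintptr_t last_page = (addr + size - 1) / PAGE_SIZE_ASSUMED;
    return first_page != last_page;
}
```

With the element at the last byte of a page (offset 4095), a 1-byte load stays
within the page, while the 2-byte load that pinsrw performs also touches the
first byte of the next page, which may be unmapped.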
