https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95021

--- Comment #5 from H.J. Lu <hjl.tools at gmail dot com> ---
STV (the scalar-to-vector pass) generates:

        8d b6 00 00 00 00       lea    0x0(%esi),%esi
        a1 00 00 00 00          mov    0x0,%eax R_386_32        target_p
        83 ec 08                sub    $0x8,%esp
        f3 0f 7e 00             movq   (%eax),%xmm0
        a1 00 00 00 00          mov    0x0,%eax R_386_32        c
        66 0f 6f c8             movdqa %xmm0,%xmm1
        66 0f 7e 44 24 10       movd   %xmm0,0x10(%esp)
        66 0f 73 d1 20          psrlq  $0x20,%xmm1
        66 0f d6 00             movq   %xmm0,(%eax)
        66 0f 7e 4c 24 14       movd   %xmm1,0x14(%esp)
        ff 74 24 14             pushl  0x14(%esp)
        ff 74 24 14             pushl  0x14(%esp)
        e8 fc ff ff ff          call   <d+0x53> R_386_PC32      e

instead of

        8d b6 00 00 00 00       lea    0x0(%esi),%esi
        a1 00 00 00 00          mov    0x0,%eax R_386_32        target_p
        8b 0d 00 00 00 00       mov    0x0,%ecx R_386_32        c
        83 ec 08                sub    $0x8,%esp
        8b 50 04                mov    0x4(%eax),%edx
        8b 00                   mov    (%eax),%eax
        89 51 04                mov    %edx,0x4(%ecx)
        89 01                   mov    %eax,(%ecx)
        52                      push   %edx
        50                      push   %eax
        e8 fc ff ff ff          call   <d+0x3b> R_386_PC32      e

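It is hard to tell whether the vector sequence is faster. It replaces the two
32-bit loads and two 32-bit stores with one movq each, but it then has to spill
both halves of the value to the stack and reload them for the two pushl
instructions.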
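
For readers without the testcase at hand, both listings appear to implement a 64-bit copy through two pointers followed by a call that takes the copied value as its argument. The following is only a rough sketch of that shape, not the testcase from this PR: the names target_p, c, d and e are taken from the relocations in the dumps, while the int64_t types, the backing variables and the body of e are assumptions made to keep the example self-contained.

```c
#include <stdint.h>

/* Hypothetical backing storage; not part of the original testcase. */
static int64_t src_value = 0x1122334455667788LL;
static int64_t dst_value;
static int64_t last_arg;

/* Names taken from the relocations in the dumps; types are assumed. */
int64_t *target_p = &src_value;
int64_t *c = &dst_value;

/* Stand-in callee that records its 64-bit argument. */
void e(int64_t v)
{
    last_arg = v;
}

void d(void)
{
    int64_t v = *target_p;  /* 64-bit load: one movq, or two 32-bit movs */
    *c = v;                 /* 64-bit store: one movq, or two 32-bit movs */
    e(v);                   /* on ia32 the argument goes on the stack as two 32-bit pushes */
}
```

Building something like this with -m32 -O2 -msse2 and comparing -mstv against -mno-stv may show a similar difference, although the exact instruction sequence depends on the GCC version and the surrounding code.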
