https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80833
--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
The patch from comment #7 generates:

a) DImode move for 32 bit targets:

--cut here--
long long test (long long a)
{
  asm ("" : "+x" (a));
  return a;
}
--cut here--

gcc -O2 -msse4.1 -mtune=intel -mregparm=2:

        movd    %eax, %xmm0
        pinsrd  $1, %edx, %xmm0
        movq    %xmm0, (%esp)      <<-- unneeded store due to RA problem
        movd    %xmm0, %eax
        pextrd  $1, %xmm0, %edx
        leal    12(%esp), %esp

b) TImode move for 64 bit targets:

--cut here--
__int128 test (__int128 a)
{
  asm ("" : "+x" (a));
  return a;
}
--cut here--

gcc -O2 -msse4.1 -mtune=intel:

        movq    %rdi, %xmm0
        pinsrq  $1, %rsi, %xmm0
        pextrq  $1, %xmm0, %rdx
        movq    %xmm0, %rax
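
For readers less familiar with these instructions, the data movement in case a) can be modelled in portable C. This is only a sketch under stated assumptions: the `xmm_model` type and all `model_*` functions are hypothetical names invented for illustration, not anything from GCC or the patch, and the model covers just the lane semantics of movd, pinsrd and pextrd (register allocation and the stack store are not modelled).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one 128-bit SSE register as four 32-bit lanes. */
typedef struct { uint32_t lane[4]; } xmm_model;

/* movd r32 -> xmm: write lane 0 and zero the upper lanes. */
static xmm_model model_movd_to_xmm(uint32_t lo)
{
    xmm_model x = { { lo, 0, 0, 0 } };
    return x;
}

/* pinsrd $idx, r32, xmm: replace one lane, keep the others. */
static xmm_model model_pinsrd(xmm_model x, uint32_t v, unsigned idx)
{
    x.lane[idx & 3] = v;
    return x;
}

/* pextrd $idx, xmm -> r32: read one lane (idx 0 behaves like movd xmm -> r32). */
static uint32_t model_pextrd(xmm_model x, unsigned idx)
{
    return x.lane[idx & 3];
}

/* Round trip of a 64-bit value passed in the edx:eax pair, as in case a). */
static uint64_t model_di_roundtrip(uint64_t a)
{
    uint32_t eax = (uint32_t)a;          /* low half  */
    uint32_t edx = (uint32_t)(a >> 32);  /* high half */

    xmm_model x = model_movd_to_xmm(eax);  /* movd   %eax, %xmm0     */
    x = model_pinsrd(x, edx, 1);           /* pinsrd $1, %edx, %xmm0 */

    eax = model_pextrd(x, 0);              /* movd   %xmm0, %eax     */
    edx = model_pextrd(x, 1);              /* pextrd $1, %xmm0, %edx */
    return ((uint64_t)edx << 32) | eax;
}

/* Small self-check: the value must survive the GPR -> SSE -> GPR trip. */
static void model_selfcheck(void)
{
    assert(model_di_roundtrip(0x0123456789abcdefULL) == 0x0123456789abcdefULL);
}
```

Calling `model_selfcheck()` exercises the same four-instruction sequence as the listing above. Case b) follows the same pattern with two 64-bit lanes (pinsrq/pextrq) instead of 32-bit ones.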