This is a friendly reminder that there's still no way to enjoy pextrw without undue zero/sign extension unless inline asm is used; there's even a gradient of ignominy from intrinsics to builtins, as exemplified by:

$ cat pextrw.cc
#include <smmintrin.h>
long unsigned int foo1(__m128i x) { return _mm_extract_epi16(x, 3); }
long unsigned int foo2(__v8hi x) { return __builtin_ia32_vec_ext_v8hi((__v8hi) x, 3); }
int main() { return 0; }
$ /usr/local/gcc-4.6-20100811/bin/g++ -O3 -march=native pextrw.cc

00000000004004a0 <_Z4foo1Dv2_x>:
  4004a0:  66 0f c5 c0 03     pextrw $0x3,%xmm0,%eax
  4004a5:  98                 cwtl
  4004a6:  48 98              cltq
  4004a8:  c3                 retq

00000000004004b0 <_Z4foo2Dv8_s>:
  4004b0:  66 0f c5 c0 03     pextrw $0x3,%xmm0,%eax
  4004b5:  48 0f bf c0        movswq %ax,%rax
  4004b9:  c3                 retq

That's on x86-64, on an Intel i7 which, incidentally, is much faster at that whole pextrw business than previous generations.

This report may or may not be construed as a duplicate of the long-forgotten PR 41323.

--
           Summary: pextrw, redundant zero (or otherwise) extension
           Product: gcc
           Version: 4.6.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: tbptbp at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=45294
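
For reference, the inline asm route mentioned at the top might look roughly like the sketch below. It is not part of the original test case: the helper name extract_word3 is made up for illustration, and it assumes x86-64, AT&T syntax and a GCC-compatible compiler. It relies on pextrw zero-extending the word into the 32-bit destination, and on 32-bit register writes clearing the upper half of the 64-bit register, so no separate extension should be needed. The semantics are those of foo1 (zero extension), not the signed result of foo2.

```cpp
#include <emmintrin.h>

// Hypothetical helper (not from the report): extract 16-bit lane 3 of x
// straight into a 64-bit result, zero-extended.
// Assumes x86-64 and GCC-style extended asm in AT&T syntax.
static inline unsigned long extract_word3(__m128i x)
{
    unsigned long r;
    // %k0 names the 32-bit view of the output register (e.g. %eax).
    // pextrw zero-extends the word to 32 bits, and the 32-bit write
    // zeroes bits 63:32 of the full register.
    __asm__("pextrw $3, %1, %k0" : "=r"(r) : "x"(x));
    return r;
}
```

Calling extract_word3 on a vector whose lane 3 holds 0x1234 should return 0x1234, with the upper bits of the result clear.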