https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78563
Bug ID: 78563
Summary: SSE4.1 pmovzx shuffle pattern not recognized
Product: gcc
Version: 6.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: linux at carewolf dot com
Target Milestone: ---

An unpack pattern with a zero constant is neither folded nor recognized as a pmovzx instruction.

SSE2 code:
    _mm_unpacklo_epi32(X, _mm_setzero_si128())

GCC code:
    __builtin_shuffle((__v4si)X, (__v4si)_mm_setzero_si128(), (__v4si){0, 4, 1, 5});

Both produce the same code: an xor to zero a register followed by an unpack instruction, whereas with SSE4.1 GCC could emit a single pmovzx instruction.

Note that epi32 is only used as an example here because it is the most compact; this also affects the 8- and 16-bit equivalents.

Looking at config/i386/i386.c, there seems to be no code in the expand_vec_perm_* functions for detecting pmovzx patterns.
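
For reference, here is a minimal sketch of the equivalence the report relies on, assuming an x86-64 target and a GCC new enough to accept the target("sse4.1") function attribute. The helper names (widen_unpack, widen_pmovzx, widen_forms_agree) are made up for illustration and do not appear in GCC. For the epi32 case, the matching SSE4.1 intrinsic is _mm_cvtepu32_epi64, which corresponds to pmovzxdq.

```c
#include <stdint.h>
#include <string.h>
#include <emmintrin.h>  /* SSE2: _mm_unpacklo_epi32, _mm_setzero_si128 */
#include <smmintrin.h>  /* SSE4.1: _mm_cvtepu32_epi64 */

/* SSE2 form from the report: interleave the low two 32-bit lanes with zero,
   giving lanes { x0, 0, x1, 0 }. */
static __m128i widen_unpack(__m128i x)
{
    return _mm_unpacklo_epi32(x, _mm_setzero_si128());
}

/* SSE4.1 form: zero-extend the low two 32-bit lanes to 64 bits (pmovzxdq). */
__attribute__((target("sse4.1")))
static __m128i widen_pmovzx(__m128i x)
{
    return _mm_cvtepu32_epi64(x);
}

/* Hypothetical helper: returns 1 if both forms give bit-identical results
   for the four 32-bit inputs, 0 otherwise. */
__attribute__((target("sse4.1")))
int widen_forms_agree(const uint32_t in[4])
{
    __m128i x = _mm_loadu_si128((const __m128i *)in);
    __m128i a = widen_unpack(x);
    __m128i b = widen_pmovzx(x);
    return memcmp(&a, &b, sizeof a) == 0;
}
```

Comparing the assembly for widen_unpack and widen_pmovzx (for example with -O2 -S) should show the difference described above: pxor plus punpckldq for the first, a single pmovzxdq for the second.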