https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885
Bug ID: 93885
Summary: Spurious instruction kshiftlw issued
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: guille at berkeley dot edu
Target Milestone: ---

[gcc version 10.0.1 20200216]

The following code ( https://www.godbolt.org/z/JynEy6 )

    #include <immintrin.h>

    __m512 f(__m512 x, __m512 y, __m512 z) {
        const __mmask16 masky = 0b0010001000100010;
        const __m512 xy = _mm512_mask_blend_ps(masky, x, y);
        const __mmask16 maskz = _kshiftli_mask16(masky, 1);
        return _mm512_mask_blend_ps(maskz, xy, z);
    }

computes 'maskz' by shifting 'masky'. The generated asm:

    f(float __vector(16), float __vector(16), float __vector(16)):
            mov     eax, 8738
            kmovw   k1, eax
            mov     eax, 17476
            vmovaps zmm0{k1}, zmm1
            kmovw   k2, eax
            kshiftlw        k1, k1, 1
            vmovaps zmm0{k2}, zmm2
            ret

both loads the value (17476) directly, *and* performs the left-shift
'kshiftlw'. Unless I'm missing something, it would seem one of them
(move+kmovw, or kshiftlw) isn't needed.

Thanks.
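
For reference, the two immediates in the asm are the two masks: 8738 is 'masky' and 17476 is 'masky' shifted left by one bit, so the compiler has already folded the shift into a constant. The sketch below checks that arithmetic at compile time. It is only an illustration: 'shift_left_mask16' is a hypothetical portable stand-in for '_kshiftli_mask16' (assumed here to return 0 for shift counts above 15), not a GCC or Intel API, so that the snippet builds without AVX-512 support.

```cpp
#include <cstdint>

// Hypothetical stand-in for _kshiftli_mask16 (not a real intrinsic):
// shifts a 16-bit mask left by `count`; counts above 15 are assumed to give 0.
constexpr std::uint16_t shift_left_mask16(std::uint16_t mask, unsigned count) {
    return count > 15 ? std::uint16_t{0}
                      : static_cast<std::uint16_t>(mask << count);
}

// Same constants as in the report.
constexpr std::uint16_t masky = 0b0010001000100010;           // 0x2222
constexpr std::uint16_t maskz = shift_left_mask16(masky, 1);  // 0x4444

// These are the immediates that appear in the generated asm.
static_assert(masky == 8738, "first 'mov eax' immediate");
static_assert(maskz == 17476, "second 'mov eax' immediate");
```

Since 17476 is loaded into k2 directly and k1 is not read after the 'kshiftlw', the shift result appears to be dead in the listing above.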