https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93885

            Bug ID: 93885
           Summary: Spurious instruction kshiftlw issued
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: guille at berkeley dot edu
  Target Milestone: ---

[gcc version 10.0.1 20200216]

The following code ( https://www.godbolt.org/z/JynEy6 )
#include <immintrin.h>
__m512 f(__m512 x, __m512 y, __m512 z)
{
    const __mmask16 masky = 0b0010001000100010;
    const __m512 xy = _mm512_mask_blend_ps(masky, x, y);
    const __mmask16 maskz = _kshiftli_mask16(masky, 1);
    return _mm512_mask_blend_ps(maskz, xy, z);
}

computes 'maskz' by shifting 'masky'. The generated asm:

f(float __vector(16), float __vector(16), float __vector(16)):
        mov     eax, 8738
        kmovw   k1, eax
        mov     eax, 17476
        vmovaps zmm0{k1}, zmm1
        kmovw   k2, eax
        kshiftlw        k1, k1, 1
        vmovaps zmm0{k2}, zmm2
        ret

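both loads the already-shifted value (17476) directly, *and* performs the
left shift with 'kshiftlw'. The 'kshiftlw' result is written to k1, which is
never read afterwards; the second blend uses k2.
Unless I'm missing something, it would seem one of the two (mov+kmovw, or
kshiftlw) isn't needed.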
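
For reference, here is a minimal scalar sketch of the arithmetic behind the two
immediates in the listing. It does not use AVX-512 at all. The helper names
'shift_mask16_left' and 'verify_listing_constants' are made up for
illustration, and the zero result for shift counts above 15 is my reading of
how KSHIFTLW behaves, so treat that part as an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a 16-bit mask left shift (hypothetical helper, not a GCC
   or Intel API). Assumption: counts above 15 yield zero, as KSHIFTLW does. */
static uint16_t shift_mask16_left(uint16_t mask, unsigned count)
{
    if (count > 15)
        return 0;
    /* Widen before shifting, then truncate back to 16 bits. */
    return (uint16_t)((uint32_t)mask << count);
}

/* Checks the two immediates from the listing against the source constants. */
static void verify_listing_constants(void)
{
    const uint16_t masky = 0x2222;  /* same bits as 0b0010001000100010 */
    const uint16_t maskz = shift_mask16_left(masky, 1);

    assert(masky == 8738);   /* first  "mov eax, 8738"  */
    assert(maskz == 17476);  /* second "mov eax, 17476" */
}
```

Calling 'verify_listing_constants()' passes silently, which matches the
observation above: 17476 is already 'masky << 1', so the compiler has folded
the shift into a constant and the separate 'kshiftlw' adds nothing.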

Thanks.
