Bug ID: 77628
           Summary: avx512: unnecessary GR extending after kmovw
           Product: gcc
           Version: 5.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot
          Reporter: wojciech.mula at microgen dot com
  Target Milestone: ---

According to the latest documentation from Intel, the kmovw instruction
zeroes the upper bits of a GP register destination:

    IF *destination is a memory location*
        DEST[15:0] <- SRC[15:0]
    IF *destination is a mask register or a GPR*
        DEST <- ZeroExtension(SRC[15:0])
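
The pseudocode above can be illustrated with a small C model. This is only a
sketch: the function names (model_kmovw_to_gpr, model_movzwl,
movzwl_is_redundant_after_kmovw) are hypothetical and model the documented
semantics in plain C; they are not compiler or hardware interfaces. It checks
that zero-extending the low 16 bits again never changes a value that kmovw
has already zero-extended.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `kmovw %k1, %eax`:
   DEST <- ZeroExtension(SRC[15:0]). */
static uint32_t model_kmovw_to_gpr(uint64_t k)
{
    return (uint32_t)(k & 0xFFFFu);
}

/* Hypothetical model of `movzwl %ax, %eax`: zero-extend the low 16 bits. */
static uint32_t model_movzwl(uint32_t r)
{
    return (uint32_t)(uint16_t)r;
}

/* Returns 1 if movzwl leaves every possible kmovw result unchanged. */
static int movzwl_is_redundant_after_kmovw(void)
{
    for (uint32_t m = 0; m <= 0xFFFFu; ++m) {
        /* Set the upper bits of the modelled source to show they are dropped. */
        uint64_t k = 0xFFFFFFFFFFFF0000ull | m;
        uint32_t eax = model_kmovw_to_gpr(k);
        if (eax != m || model_movzwl(eax) != eax)
            return 0;
    }
    return 1;
}
```

Under this model the second zero-extension is a no-op for all 65536 mask
values, which is why the movzwl looks unnecessary.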

GCC nevertheless emits a superfluous movzwl after kmovw:

    #include <stdint.h>
    #include <immintrin.h>

    uint32_t test(__m512i a, __m512i b) {
        uint32_t c = _mm512_cmpeq_epi32_mask(a, b);
        return c;
    }

$ gcc-5 --version
gcc-5 (Debian 5.3.1-13) 5.3.1 20160323
$ gcc-5 -O3 -S -mavx512f report.cpp

Assembly output:

        vpcmpeqd        %zmm1, %zmm0, %k1
        kmovw   %k1, %eax
        movzwl  %ax, %eax <<<< HERE
