https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88808

            Bug ID: 88808
           Summary: bitwise operators on AVX512 masks fail to use the new
                    mask instructions
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: kretz at kde dot org
  Target Milestone: ---
            Target: x86_64-*-*, i?86-*-*

Test case (https://godbolt.org/z/gyCN12):
#include <x86intrin.h>

using V [[gnu::vector_size(16)]] = float;

auto f(V x) {
    auto mask = _mm_fpclass_ps_mask(x, 16) | _mm_fpclass_ps_mask(x, 8);
    return _mm_mask_blend_ps(mask, x, x + 1);
}

auto g(V x) {
    auto mask = _kor_mask8(_mm_fpclass_ps_mask(x, 16),
                           _mm_fpclass_ps_mask(x, 8));
    return _mm_mask_blend_ps(mask, x, x + 1);
}

Function f should compile to the same code as g does, i.e. use korb on the mask
registers instead of kmovb + orl + kmovb. Similar test cases can be constructed
for kxor, kand, and kandn, as well as for masks of 8 and 16 bits (likely for 32
and 64 as well, but I have not tested those). For kand it's a bit trickier to
trigger, but e.g. the following shows it:

__mmask8 foo = 0;
auto f(V x) {
    auto mask0 = _mm_fpclass_ps_mask(x, 16);
    auto mask1 = _mm_fpclass_ps_mask(x, 8);
    foo = mask0 | mask1;
    return _mm_mask_blend_ps(mask0 & mask1, x, x + 1);
}
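
The kxor and kandn variants mentioned above could be sketched along the same
lines. This is only an illustration, not a verified reproducer: the function
names (f_xor, g_xor, f_andn, g_andn, same_lanes) are made up for this sketch,
and the target attribute is an assumption that stands in for passing
-mavx512dq -mavx512vl on the command line. In each pair, the f_ variant uses
the plain operator and would ideally compile to the same mask instruction as
the g_ variant.

```cpp
#include <x86intrin.h>

using V [[gnu::vector_size(16)]] = float;

// Plain ^ on the two masks; ideally this would use kxorb like g_xor.
[[gnu::target("avx512f,avx512dq,avx512vl")]]
auto f_xor(V x) {
    auto mask = _mm_fpclass_ps_mask(x, 16) ^ _mm_fpclass_ps_mask(x, 8);
    return _mm_mask_blend_ps(mask, x, x + 1);
}

// Explicit mask intrinsic, for comparison of the generated code.
[[gnu::target("avx512f,avx512dq,avx512vl")]]
auto g_xor(V x) {
    auto mask = _kxor_mask8(_mm_fpclass_ps_mask(x, 16),
                            _mm_fpclass_ps_mask(x, 8));
    return _mm_mask_blend_ps(mask, x, x + 1);
}

// Plain ~a & b on the two masks; ideally this would use kandnb like g_andn.
[[gnu::target("avx512f,avx512dq,avx512vl")]]
auto f_andn(V x) {
    auto mask0 = _mm_fpclass_ps_mask(x, 16);
    auto mask1 = _mm_fpclass_ps_mask(x, 8);
    return _mm_mask_blend_ps(~mask0 & mask1, x, x + 1);
}

// _kandn_mask8(a, b) computes (~a) & b.
[[gnu::target("avx512f,avx512dq,avx512vl")]]
auto g_andn(V x) {
    auto mask0 = _mm_fpclass_ps_mask(x, 16);
    auto mask1 = _mm_fpclass_ps_mask(x, 8);
    return _mm_mask_blend_ps(_kandn_mask8(mask0, mask1), x, x + 1);
}

// Hypothetical helper: lane-by-lane equality of two 4-float vectors,
// useful for checking that an f_/g_ pair agrees on a given input.
bool same_lanes(V a, V b) {
    for (int i = 0; i < 4; ++i)
        if (a[i] != b[i])
            return false;
    return true;
}
```

Each pair should produce identical results; the interesting part is comparing
the assembly of the f_ and g_ variants (e.g. with -O2 -S).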
