https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88808
Bug ID: 88808
Summary: bitwise operators on AVX512 masks fail to use the new mask instructions
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: kretz at kde dot org
Target Milestone: ---
Target: x86_64-*-*, i?86-*-*

Test case (https://godbolt.org/z/gyCN12):

#include <x86intrin.h>

using V [[gnu::vector_size(16)]] = float;

auto f(V x) {
  auto mask = _mm_fpclass_ps_mask(x, 16) | _mm_fpclass_ps_mask(x, 8);
  return _mm_mask_blend_ps(mask, x, x + 1);
}

auto g(V x) {
  auto mask = _kor_mask8(_mm_fpclass_ps_mask(x, 16), _mm_fpclass_ps_mask(x, 8));
  return _mm_mask_blend_ps(mask, x, x + 1);
}

Function f should compile to the same code as g does, i.e. use korb instead of
kmovb + orl + kmovb.

Similar test cases can be constructed for kxor, kand, and kandn as well as for
masks of 8 and 16 bits (likely for 32 and 64 as well, but I have not tested
it).

For kand it's a bit trickier to trigger, but e.g. the following shows it:

__mmask8 foo = 0;

auto f(V x) {
  auto mask0 = _mm_fpclass_ps_mask(x, 16);
  auto mask1 = _mm_fpclass_ps_mask(x, 8);
  foo = mask0 | mask1;
  return _mm_mask_blend_ps(mask0 & mask1, x, x + 1);
}
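For reference, here is a minimal portable sketch of why the plain operators and the mask intrinsics should be interchangeable. It assumes __mmask8 behaves as an 8-bit unsigned integer and models the 8-bit mask operations in scalar C++. The names mask8, model_kor, model_kand, model_kxor and model_kandn are hypothetical and exist only for this illustration. They stand in for _kor_mask8, _kand_mask8, _kxor_mask8 and _kandn_mask8, so that the sketch compiles without AVX512 support.

```cpp
#include <cstdint>

// Hypothetical stand-in for __mmask8 (assumed to be an 8-bit unsigned integer).
using mask8 = std::uint8_t;

// Scalar models of the 8-bit mask operations. Each one is the ordinary
// bitwise operator, truncated back to 8 bits after integer promotion.
constexpr mask8 model_kor(mask8 a, mask8 b)  { return static_cast<mask8>(a | b); }
constexpr mask8 model_kand(mask8 a, mask8 b) { return static_cast<mask8>(a & b); }
constexpr mask8 model_kxor(mask8 a, mask8 b) { return static_cast<mask8>(a ^ b); }

// kandn negates its FIRST operand: result = (NOT a) AND b.
constexpr mask8 model_kandn(mask8 a, mask8 b) { return static_cast<mask8>(~a & b); }

// Compile-time checks using the fpclass category bits from the test case
// (16 and 8) as sample mask values.
static_assert(model_kor(0x10, 0x08) == 0x18, "or combines both masks");
static_assert(model_kand(0x18, 0x08) == 0x08, "and keeps the common bits");
static_assert(model_kxor(0x18, 0x08) == 0x10, "xor keeps the differing bits");
static_assert(model_kandn(0x10, 0x18) == 0x08, "andn clears the bits set in a");
```

Under that assumption, mask0 | mask1, mask0 & mask1, mask0 ^ mask1 and ~mask0 & mask1 each have a one-to-one mask-register counterpart, which is what the requested optimization would rely on.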