https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88461
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Target|X86_64                      |x86_64-*-*, i?86-*-*
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-12-12
     Ever confirmed|0                           |1

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
  <bb 2> [local count: 1073741824]:
  _13 = MEM[(const __m128i * {ref-all})data_3(D)];
  _11 = VIEW_CONVERT_EXPR<vector(8) short int>(_13);
  _12 = __builtin_ia32_ptestnmw128 (_11, _11, 255);
  _1 = (int) _12;
  _10 = __builtin_ia32_kshiftlihi (_1, 1);
  _14 = a_6(D) & 65535;
  _5 = _10 & 255;
  _4 = (int) _5;
  _9 = __builtin_ia32_kandnhi (_4, _14);
  m_7 = (__mmask8) _9;
  _8 = (int) m_7;
  return _8;

probably an artifact of C promoting __mmask8 to int:

  __m128i v = _mm_load_si128 ((const __m128i * {ref-all}) data);
  __mmask8 m = _mm_testn_epi16_mask (v, v);

    __m128i v = _mm_load_si128 ((const __m128i * {ref-all}) data);
    __mmask8 m = _mm_testn_epi16_mask (v, v);
  m = (__mmask8) _kshiftli_mask16 ((int) m, 1);
  m = (__mmask8) _mm512_kandn ((int) m, (int) (__mmask16) a);
  return (int) m;

and

;; Function _kshiftli_mask16 (null)
;; enabled by -tree-original

{
  return (__mmask16) __builtin_ia32_kshiftlihi ((int) __A, (int) (unsigned char) __B);
}

btw, why are you using mask16 intrinsics on mask8 types?  When using
kshiftli_mask8 and kandn_mask8 I get

        vmovdqa64       (%rdi), %xmm0
        kmovb   %esi, %k3
        vptestnmw       %xmm0, %xmm0, %k1
        kshiftlb        $1, %k1, %k0
        kandnb  %k3, %k0, %k2
        kmovb   %k2, %eax
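
The effect of the intermediate conversions can be illustrated with a small portable C model of the mask arithmetic. This is only a sketch and not the reporter's test case: it uses no AVX-512 intrinsics, and the names model_kshiftli16, model_kandn16, model_kshiftli8, model_kandn8, via_mask16, via_mask8 and count_mismatches are hypothetical. It assumes that a k-register left shift discards bits shifted beyond the mask width (and yields zero once the count reaches the width), and that kandn computes (~first) & second. Under those assumptions, going through the 16-bit operations with truncations back to 8 bits, as in the GIMPLE above, should give the same value as the 8-bit operations for a shift count of 1.

```c
#include <stdint.h>

typedef uint8_t  mask8;   /* stand-in for __mmask8  (assumption) */
typedef uint16_t mask16;  /* stand-in for __mmask16 (assumption) */

/* Model of a 16-bit mask left shift: bits shifted past bit 15 are lost. */
static mask16 model_kshiftli16(mask16 a, unsigned count)
{
    return count > 15 ? 0 : (mask16)(a << count);
}

/* Model of a 16-bit mask and-not: (~a) & b. */
static mask16 model_kandn16(mask16 a, mask16 b)
{
    return (mask16)(~a & b);
}

/* Model of an 8-bit mask left shift: bits shifted past bit 7 are lost. */
static mask8 model_kshiftli8(mask8 a, unsigned count)
{
    return count > 7 ? 0 : (mask8)(a << count);
}

/* Model of an 8-bit mask and-not: (~a) & b. */
static mask8 model_kandn8(mask8 a, mask8 b)
{
    return (mask8)(~a & b);
}

/* Mirrors the GIMPLE: 16-bit operations with truncation back to 8 bits. */
static int via_mask16(mask8 m, int a)
{
    m = (mask8)model_kshiftli16((mask16)m, 1);       /* _10, then _5 = _10 & 255 */
    m = (mask8)model_kandn16((mask16)m, (mask16)a);  /* _14 = a & 65535, then m_7 */
    return (int)m;
}

/* The same computation using the 8-bit operations throughout. */
static int via_mask8(mask8 m, int a)
{
    m = model_kshiftli8(m, 1);
    m = model_kandn8(m, (mask8)a);
    return (int)m;
}

/* Exhaustively compares both variants over every 8-bit mask and 16-bit a. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int m = 0; m < 256; m++)
        for (int a = 0; a < 65536; a++)
            if (via_mask16((mask8)m, a) != via_mask8((mask8)m, a))
                mismatches++;
    return mismatches;
}
```

In this model count_mismatches() returns 0, so the extra "& 255" and "& 65535" steps change only the intermediate representation, not the final 8-bit result.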