Issue 97694
Summary [AVX512] Replace x86_avx512_mask_pmov_* register intrinsics with generic expansion
Labels backend:X86
Assignees
Reporter RKSimon
We already expand the basic truncation intrinsics to a trunc+shuffle sequence:
```cpp
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_cvtepi64_epi8 (__m128i __A)
{
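  // Indices 0-1 pick the two truncated bytes; indices 2-3 pick the zero
  // vector, so the upper 14 bytes of the result are zero.
  return (__m128i)__builtin_shufflevector(
      __builtin_convertvector((__v2di)__A, __v2qi), (__v2qi){0, 0},
      0, 1, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3);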
}
```
But the predicated (mask/maskz) variants are not, as it's tricky to correctly merge the predicate select of the lower elements with the zeroing of the upper elements:
```cpp
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_mask_cvtepi64_epi8 (__m128i __O, __mmask8 __M, __m128i __A)
{
  return (__m128i) __builtin_ia32_pmovqb128_mask ((__v2di) __A,
              (__v16qi) __O, __M);
}

static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_maskz_cvtepi64_epi8 (__mmask8 __M, __m128i __A)
{
  return (__m128i) __builtin_ia32_pmovqb128_mask ((__v2di) __A,
              (__v16qi) _mm_setzero_si128 (),
              __M);
}
```
So this is likely to require some improvements to the DAG backend as well as the clang frontend.
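For reference, the behaviour that any generic expansion of the masked variants would have to preserve can be written as a scalar model. This is only a sketch: the names `Bytes16`, `Qwords2`, `ref_mask_cvtepi64_epi8` and `ref_maskz_cvtepi64_epi8` are hypothetical helpers, not anything in the LLVM tree, and the model assumes the usual VPMOVQB merge-masking semantics (only mask bits 0-1 are consulted, bytes 2-15 of the result are always zeroed).

```cpp
#include <array>
#include <cstdint>

// Hypothetical scalar model (not LLVM code) of the 128-bit masked
// qword -> byte truncation performed by _mm_mask_cvtepi64_epi8.
using Bytes16 = std::array<std::uint8_t, 16>;
using Qwords2 = std::array<std::uint64_t, 2>;

Bytes16 ref_mask_cvtepi64_epi8(const Bytes16 &o, std::uint8_t m,
                               const Qwords2 &a) {
  Bytes16 r{}; // value-initialised, so bytes 2..15 stay zero
  for (int i = 0; i < 2; ++i) {
    // Lower lanes: truncate the 64-bit source element when mask bit i is
    // set, otherwise keep the matching byte of the passthru operand.
    r[i] = ((m >> i) & 1) ? static_cast<std::uint8_t>(a[i]) : o[i];
  }
  return r;
}

// The zero-masking form is the merge form with an all-zero passthru.
Bytes16 ref_maskz_cvtepi64_epi8(std::uint8_t m, const Qwords2 &a) {
  return ref_mask_cvtepi64_epi8(Bytes16{}, m, a);
}
```

The model makes the difficulty visible: the two lower bytes are a per-element select against the passthru, while the upper fourteen bytes are unconditionally zero regardless of the passthru, so a single vector-wide select on the mask does not describe the result by itself.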