[llvm-bugs] [Bug 97694] [AVX512] Replace x86_avx512_mask_pmov_* register intrinsics with generic expansion

LLVM Bugs via llvm-bugs Thu, 04 Jul 2024 01:31:55 -0700

Issue	97694
Summary	[AVX512] Replace x86_avx512_mask_pmov_* register intrinsics with generic expansion
Labels	backend:X86
Assignees
Reporter	RKSimon

We already expand the basic (unmasked) truncation intrinsics to a generic trunc+shuffle sequence:
```cpp
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_cvtepi64_epi8 (__m128i __A)
{
  return (__m128i)__builtin_shufflevector(
      __builtin_convertvector((__v2di)__A, __v2qi), (__v2qi){0, 0}, 0, 1, 2, 3,
      3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3);
}
```
But the predicated variants are not, as it's tricky to correctly merge the predicate select of the lower elements with the zeroing of the upper elements:
```cpp
static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_mask_cvtepi64_epi8 (__m128i __O, __mmask8 __M, __m128i __A)
{
  return (__m128i) __builtin_ia32_pmovqb128_mask ((__v2di) __A,
              (__v16qi) __O, __M);
}


static __inline__ __m128i __DEFAULT_FN_ATTRS128
_mm_maskz_cvtepi64_epi8 (__mmask8 __M, __m128i __A)
{
  return (__m128i) __builtin_ia32_pmovqb128_mask ((__v2di) __A,
              (__v16qi) _mm_setzero_si128 (),
              __M);
}
```
So this is likely to require some improvements to the DAG backend as well as the clang frontend.
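
For reference, the behaviour any generic expansion would have to reproduce can be written as a scalar model. This is only a sketch of the register-form semantics as I understand them (truncate each active qword lane, take inactive low lanes from the passthrough, zero everything above the truncated lanes); the names `Bytes16`, `Qwords2`, `model_mask_cvtepi64_epi8` and `model_maskz_cvtepi64_epi8` are hypothetical and not part of clang or LLVM:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper types for this sketch only.
using Bytes16 = std::array<std::uint8_t, 16>;
using Qwords2 = std::array<std::uint64_t, 2>;

// Scalar model of the masked 128-bit qword -> byte truncation (register form).
// Lanes 0..1: predicate select between the truncated source and the passthrough.
// Lanes 2..15: always zero, regardless of the mask or the passthrough.
Bytes16 model_mask_cvtepi64_epi8(const Bytes16 &passthru, std::uint8_t mask,
                                 const Qwords2 &a) {
  Bytes16 result{};  // value-initialised, so the upper lanes stay zero
  for (std::size_t i = 0; i < a.size(); ++i) {
    const bool active = ((mask >> i) & 1u) != 0;
    result[i] = active ? static_cast<std::uint8_t>(a[i]) : passthru[i];
  }
  return result;
}

// Zero-masking variant: same operation with an all-zero passthrough.
Bytes16 model_maskz_cvtepi64_epi8(std::uint8_t mask, const Qwords2 &a) {
  return model_mask_cvtepi64_epi8(Bytes16{}, mask, a);
}
```

The awkward part for a generic expansion is visible in the loop bounds: the select only covers the low two lanes, while the zeroing of the remaining fourteen lanes is unconditional, so a plain full-width vector select against `__O` would not give the right upper elements.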
