Issue 71078
Summary [X86] X86FixupVectorConstantsPass - add support for VPMOVSX/VPMOVZX constant extension
Labels backend:X86
Assignees
Reporter RKSimon
X86FixupVectorConstantsPass currently only folds full-width vector constant loads into broadcasts, so we're missing the opportunity to use sign/zero-extending loads for non-uniform constants.
```c
void foo(const int *src, float *dst) {
    for (int i = 0; i != 16; ++i) {
        *dst++ = (float)(*src++ + ((i % 8) + 1));
    }
}
```
llc -mcpu=x86-64-v3
```asm
foo(int const*, float*): # @foo(int const*, float*)
  vmovdqa .LCPI0_0(%rip), %ymm0 # ymm0 = [1,2,3,4,5,6,7,8]
  vpaddd (%rdi), %ymm0, %ymm1
  vcvtdq2ps %ymm1, %ymm1
  vmovups %ymm1, (%rsi)
  vpaddd 32(%rdi), %ymm0, %ymm0
  vcvtdq2ps %ymm0, %ymm0
  vmovups %ymm0, 32(%rsi)
  vzeroupper
  retq
```
We can reduce the size of the constant pool entry from 32 bytes (8 x i32) to 8 bytes (8 x i8) by replacing the vmovdqa load with vpmovzxbd, which zero-extends each byte to 32 bits as it loads.
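
As a rough illustration of the check such a fold would need, the scalar C sketch below tests whether every 32-bit element of a constant fits in 8 bits and models the zero-extending load. The helper names (`fits_zext8`, `fits_sext8`, `narrow_to_u8`, `zext8_to_32`) are hypothetical and are not taken from the LLVM sources; the actual pass operates on constant pool entries rather than plain arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if every element is representable as an
   unsigned 8-bit value, so a zero-extending load could rebuild it. */
static inline bool fits_zext8(const int32_t *elts, size_t n) {
    for (size_t i = 0; i != n; ++i)
        if (elts[i] < 0 || elts[i] > UINT8_MAX)
            return false;
    return true;
}

/* Hypothetical helper: true if every element is representable as a
   signed 8-bit value, so a sign-extending load could rebuild it. */
static inline bool fits_sext8(const int32_t *elts, size_t n) {
    for (size_t i = 0; i != n; ++i)
        if (elts[i] < INT8_MIN || elts[i] > INT8_MAX)
            return false;
    return true;
}

/* Build the narrowed constant that would be placed in the constant pool. */
static inline void narrow_to_u8(const int32_t *src, uint8_t *dst, size_t n) {
    assert(fits_zext8(src, n)); /* narrowing must be lossless */
    for (size_t i = 0; i != n; ++i)
        dst[i] = (uint8_t)src[i];
}

/* Scalar model of a zero-extending load: each byte widens to 32 bits. */
static inline void zext8_to_32(const uint8_t *src, int32_t *dst, size_t n) {
    for (size_t i = 0; i != n; ++i)
        dst[i] = (int32_t)src[i];
}
```

For the constant in this report, `[1,2,3,4,5,6,7,8]`, both checks pass, so either the zero-extending or the sign-extending form would reproduce the original vector from an 8-byte pool entry.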
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
