| Issue |
71078
|
| Summary |
[X86] X86FixupVectorConstantsPass - add support for VPMOVSX/VPMOVZX constant extension
|
| Labels |
backend:X86
|
| Assignees |
|
| Reporter |
RKSimon
|
X86FixupVectorConstantsPass currently only handles folding full vector constant loads to broadcasts; we're missing the opportunity to use sign/zero extension loads (VPMOVSX/VPMOVZX) for non-uniform constants whose elements fit in a narrower element type.
```c
void foo(const int *src, float *dst) {
  for (int i = 0; i != 16; ++i) {
    *dst++ = (float)(*src++ + ((i % 8) + 1));
  }
}
```
llc -mcpu=x86-64-v3
```asm
foo(int const*, float*): # @foo(int const*, float*)
vmovdqa .LCPI0_0(%rip), %ymm0 # ymm0 = [1,2,3,4,5,6,7,8]
vpaddd (%rdi), %ymm0, %ymm1
vcvtdq2ps %ymm1, %ymm1
vmovups %ymm1, (%rsi)
vpaddd 32(%rdi), %ymm0, %ymm0
vcvtdq2ps %ymm0, %ymm0
vmovups %ymm0, 32(%rsi)
vzeroupper
retq
```
Every element of [1,2,3,4,5,6,7,8] fits in a byte, so we can shrink the constant pool entry from 32 bytes to 8 bytes by replacing the vmovdqa full-width load with a vpmovzxbd extension load.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs