Issue 83402
Summary [X86] v4f16 -> v4i32 conversion unnecessarily using YMM registers
Labels backend:X86, missed-optimization
Assignees
Reporter RKSimon
https://rust.godbolt.org/z/PEfxzYY4z

```ll
define <4 x i32> @fptosi_4f16_to_4i32(<4 x half> %a) nounwind {
  %cvt = fptosi <4 x half> %a to <4 x i32>
  ret <4 x i32> %cvt
}
```
Compiled with `llc -mcpu=x86-64-v3`:
```asm
fptosi_4f16_to_4i32:                    # @fptosi_4f16_to_4i32
        vcvtph2ps       %xmm0, %ymm0
        vcvttps2dq      %ymm0, %ymm0
        vzeroupper
        retq
```

We should only require the xmm variants of both conversions (and so avoid the vzeroupper entirely).
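
For reference, a hand-written sketch of the codegen this would presumably produce, assuming both conversions are kept at 128 bits. This is not actual compiler output. The xmm form of `vcvtph2ps` converts only the low four half values, which is all a `<4 x half>` argument holds:

```asm
fptosi_4f16_to_4i32:                    # hypothetical expected output, not from llc
        vcvtph2ps       %xmm0, %xmm0    # 4 x f16 (low 64 bits) -> 4 x f32
        vcvttps2dq      %xmm0, %xmm0    # 4 x f32 -> 4 x i32, truncating
        retq                            # no ymm state touched, so no vzeroupper
```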