| Issue | 137520 |
|---|---|
| Summary | [AVX-512] `vpternlog` can be used even when a subexpression is consumed |
| Labels | new issue |
| Assignees | |
| Reporter | Validark |

This code:
```zig
export fn foo(a: @Vector(8, u64), b: @Vector(8, u64), c: @Vector(8, u64)) @Vector(8, u64) {
    const x = a & b;
    const y = a & b & c;
    return x *% y;
}
```
Which emits this LLVM IR:
```llvm
define dso_local <8 x i64> @foo(<8 x i64> %0, <8 x i64> %1, <8 x i64> %2) local_unnamed_addr {
Entry:
  %3 = and <8 x i64> %1, %0
  %4 = and <8 x i64> %3, %2
  %5 = mul <8 x i64> %4, %3
  ret <8 x i64> %5
}
```
Compiles to:
```asm
vpandq zmm0, zmm1, zmm0
vpandq zmm1, zmm0, zmm2
vpmullq zmm0, zmm1, zmm0
```
Should be:
```asm
vpandq zmm3, zmm1, zmm0
vpternlogq zmm2, zmm1, zmm0, 128
vpmullq zmm0, zmm2, zmm3
```
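For reference, the immediate `128` (`0x80`) in the `vpternlogq` above is the truth table of a three-input AND: bit `(a << 2) | (b << 1) | c` of the immediate holds the output for that combination of input bits, and only the all-ones combination (index 7) is set. The following is a minimal scalar sketch in Python, not LLVM code; the helper names `ternlog_imm8`, `ternlog`, `foo_current`, and `foo_proposed` are made up for illustration, and each function models a single 64-bit lane.

```python
MASK64 = (1 << 64) - 1  # one u64 lane


def ternlog_imm8(f):
    """Build the 8-bit truth table for a 3-input bitwise function f (hypothetical helper)."""
    imm = 0
    for idx in range(8):
        a, b, c = (idx >> 2) & 1, (idx >> 1) & 1, idx & 1
        if f(a, b, c) & 1:
            imm |= 1 << idx
    return imm


def ternlog(a, b, c, imm8, width=64):
    """Scalar emulation of one vpternlog lane: look up each bit triple in imm8."""
    out = 0
    for bit in range(width):
        idx = (((a >> bit) & 1) << 2) | (((b >> bit) & 1) << 1) | ((c >> bit) & 1)
        out |= ((imm8 >> idx) & 1) << bit
    return out


AND3 = ternlog_imm8(lambda a, b, c: a & b & c)  # 128 (0x80)


def foo_current(a, b, c):
    """Models the current codegen: the second AND consumes the first AND's result."""
    x = a & b
    y = x & c
    return (x * y) & MASK64  # wrapping multiply, like `*%`


def foo_proposed(a, b, c):
    """Models the proposed codegen: the ternary op reads only the original inputs."""
    x = a & b
    y = ternlog(c, b, a, AND3)
    return (x * y) & MASK64
```

Because AND is symmetric in its operands, the operand order passed to `ternlog` does not matter for this particular immediate; for other immediates it would.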
In the current assembly, the second `vpandq` depends on the result of the first one, so the two must execute serially. `vpternlogq`, on the other hand, reads only the original inputs and can execute in parallel with the first `vpandq`.
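The dependency argument can be checked by computing the longest chain of dependent instructions in each sequence. This is a rough sketch that counts every instruction as one step, which is an assumption: real latencies differ per instruction and per microarchitecture. The `critical_path` helper is hypothetical, and the instruction lists are transcribed from the two assembly listings in this report.

```python
def critical_path(instrs):
    """Longest read-after-write chain, counting each instruction as one step.

    instrs: list of (dest_register, [source_registers]) in program order.
    Registers never written before being read are treated as inputs (depth 0).
    """
    depth = {}
    longest = 0
    for dest, srcs in instrs:
        # Read source depths before overwriting dest, since dest may also be a source.
        d = 1 + max((depth.get(s, 0) for s in srcs), default=0)
        depth[dest] = d
        longest = max(longest, d)
    return longest


# Current codegen: vpandq -> vpandq -> vpmullq
current = [
    ("zmm0", ["zmm1", "zmm0"]),          # vpandq  zmm0, zmm1, zmm0
    ("zmm1", ["zmm0", "zmm2"]),          # vpandq  zmm1, zmm0, zmm2
    ("zmm0", ["zmm1", "zmm0"]),          # vpmullq zmm0, zmm1, zmm0
]

# Proposed codegen: vpandq and vpternlogq are independent of each other.
proposed = [
    ("zmm3", ["zmm1", "zmm0"]),          # vpandq     zmm3, zmm1, zmm0
    ("zmm2", ["zmm2", "zmm1", "zmm0"]),  # vpternlogq zmm2, zmm1, zmm0, 128 (dest is also a source)
    ("zmm0", ["zmm2", "zmm3"]),          # vpmullq    zmm0, zmm2, zmm3
]

print(critical_path(current), critical_path(proposed))
```

Under this unit-cost model the current sequence has a chain of three dependent instructions and the proposed one has a chain of two, at the cost of one extra register (`zmm3`).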