Issue 179008
Summary [AVX-512] Emit `vmovdqu` over `vpexpand` for dense bit sets
Labels new issue
Assignees
Reporter Rexicon226
Given some IR like:
```llvm
define dso_local <8 x i64> @foo(ptr nocapture nonnull readonly align 1 %0) local_unnamed_addr {
Entry:
  %1 = load <8 x i64>, ptr %0, align 1
  %2 = shufflevector <8 x i64> %1, <8 x i64> <i64 poison, i64 poison, i64 poison, i64 poison, i64 poison, i64 0, i64 0, i64 0>, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 13, i32 14, i32 15>
  ret <8 x i64> %2
}
```

LLVM emits:
```asm
foo:
        mov     al, 31
        kmovd   k1, eax
        vpexpandq       zmm0 {k1} {z}, zmmword ptr [rdi]
        ret
```
https://zig.godbolt.org/z/aMer3W9n7

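when it should instead emit a plain zero-masked load, which is equivalent here because the mask `0b00011111` is a contiguous run of low bits, and is typically the cheaper instruction: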
```asm
foo:
        mov     al, 31
        kmovd   k1, eax
        vmovdqu64       zmm0 {k1} {z}, zmmword ptr [rdi]
        ret
```
https://c.godbolt.org/z/Ya8EYajTE
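
To illustrate why the two instructions agree for this mask, here is a minimal scalar sketch in Python of the zero-masked, memory-source forms of both. The function names (`expand_load_z`, `masked_load_z`, `is_dense_low_mask`) and the sample data are hypothetical and only model lane semantics; this is not LLVM code and says nothing about how the backend should implement the check.

```python
def expand_load_z(mem, mask, lanes=8):
    """Model of zero-masked vpexpandq from memory: consecutive memory
    elements are placed into the lanes whose mask bit is set."""
    out = [0] * lanes
    src = 0  # index of the next contiguous memory element to consume
    for lane in range(lanes):
        if (mask >> lane) & 1:
            out[lane] = mem[src]
            src += 1
    return out


def masked_load_z(mem, mask, lanes=8):
    """Model of zero-masked vmovdqu64 from memory: lane i takes mem[i]
    when mask bit i is set, otherwise zero."""
    return [mem[i] if (mask >> i) & 1 else 0 for i in range(lanes)]


def is_dense_low_mask(mask):
    """True when the set bits form a contiguous run starting at bit 0
    (hypothetical helper name for the 'dense' condition in the title)."""
    return (mask & (mask + 1)) == 0


# Illustrative data standing in for the eight i64 values at [rdi].
mem = [10, 11, 12, 13, 14, 15, 16, 17]

# Mask 31 from the issue: both models produce the same lanes.
print(expand_load_z(mem, 0b00011111) == masked_load_z(mem, 0b00011111))

# A sparse mask: the models differ, so the rewrite would not be valid.
print(expand_load_z(mem, 0b00000101) == masked_load_z(mem, 0b00000101))
```

With a dense low mask, the n-th set bit is lane n, so the expand's "next contiguous element" is always `mem[lane]` and the two models coincide. For any other mask they diverge, which is why the suggested rewrite would presumably be gated on a check like `is_dense_low_mask`.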