[llvm-bugs] [Bug 164399] [X86] Poor AVX512 codegen with constant predicate

LLVM Bugs via llvm-bugs Tue, 21 Oct 2025 04:27:09 -0700

Issue	164399
Summary	[X86] Poor AVX512 codegen with constant predicate
Labels	backend:X86, missed-optimization
Assignees
Reporter	RKSimon

Noticed while reviewing constexpr handling of the predicated arithmetic:
```ll
define <16 x i32> @add(<16 x i32> %x, <16 x i32> %y) {
  %add = add <16 x i32> %y, %x
  %res = shufflevector <16 x i32> %add, <16 x i32> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
  ret <16 x i32> %res
}
```
```asm
add: # @add
  vpaddd %zmm0, %zmm1, %zmm0
  movw $255, %ax
  kmovd %eax, %k1
  vpexpandd %zmm0, %zmm0 {%k1} {z}
  retq
```
Lots of things going wrong here:
1. The shuffle is lowered as an expansion (`vpexpandd`) instead of a select, which would fold into the `vpaddd` as a predicated instruction
2. `movw`/`kmovd` is used instead of `kxnorb` to rematerialize the 0xFF predicate mask directly
3. The upper 256 bits of the result are zeroed, so this could have just been done as `vpaddd %ymm0, %ymm1, %ymm0`, relying on implicit zeroing
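
For reference, here is a rough sketch of the two alternative formulations that points 1 and 3 allude to, written as IR. The function names `@add_select` and `@add_ymm` are made up for illustration, and this has not been run through llc, so no codegen is claimed for either form. Both should compute the same value as `@add` above: the low 8 lanes hold the sum and the high 8 lanes are zero.

```ll
; Hypothetical select form of the same operation (point 1): lanes 0-7 take the
; sum, lanes 8-15 take zero. This is the pattern a zero-masked vpaddd with
; mask 0x00FF would implement.
define <16 x i32> @add_select(<16 x i32> %x, <16 x i32> %y) {
  %add = add <16 x i32> %y, %x
  %res = select <16 x i1> <i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 true, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false, i1 false>, <16 x i32> %add, <16 x i32> zeroinitializer
  ret <16 x i32> %res
}

; Hypothetical narrowed form (point 3): add only the low 256 bits, then widen
; with zeros. Indices 0-7 pick the sum, indices 8-15 pick the zero vector.
define <16 x i32> @add_ymm(<16 x i32> %x, <16 x i32> %y) {
  %xlo = shufflevector <16 x i32> %x, <16 x i32> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %ylo = shufflevector <16 x i32> %y, <16 x i32> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
  %add = add <8 x i32> %ylo, %xlo
  %res = shufflevector <8 x i32> %add, <8 x i32> zeroinitializer, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
  ret <16 x i32> %res
}
```

Note that middle-end passes may canonicalize a constant-condition select back into a shuffle, so feeding `@add_select` through `opt` first may well end up at the original pattern again.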

