| Field | Value |
|---|---|
| Issue | 177753 |
| Summary | AMDGPU misses fold of disjoint s_or_b32 to s_addk_i32 |
| Labels | good first issue, backend:AMDGPU, missed-optimization |
| Assignees | |
| Reporter | arsenm |

[This code size optimization](https://github.com/llvm/llvm-project/blob/69059c42f7539fd6c41b0a152862ae8741bf8016/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp#L913), which transforms scalar adds with a small constant operand into s_addk_i32, can be extended to handle s_or_b32 when the `or` carries the disjoint flag. The disjoint flag guarantees that the two operands have no set bits in common, so the `or` produces the same result as an add.
```
; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx900 < %s
; Cannot fold without disjoint flag
define amdgpu_ps i32 @s_or_b32_i32(i32 inreg %x) {
  %or = or i32 %x, 257
  ret i32 %or
}

; With the disjoint flag, the or could be shrunk to s_addk_i32
define amdgpu_ps i32 @s_or_b32_disjoint_to_s_addk_i32(i32 inreg %x) {
  %or = or disjoint i32 %x, 257
  ret i32 %or
}
```
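
The arithmetic behind the requested fold can be checked with a small Python sketch. It is not the LLVM implementation: the helper names (`fits_simm16`, `is_disjoint`, `or32`, `add32`) are hypothetical, and it only models 32-bit wraparound arithmetic and the signed 16-bit immediate range that s_addk_i32 accepts.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit scalar registers


def fits_simm16(imm: int) -> bool:
    """True if imm is representable as a signed 16-bit immediate."""
    return -(1 << 15) <= imm <= (1 << 15) - 1


def is_disjoint(x: int, imm: int) -> bool:
    """True if x and imm share no set bits (what 'or disjoint' promises)."""
    return (x & imm) & MASK32 == 0


def or32(x: int, imm: int) -> int:
    """32-bit bitwise OR."""
    return (x | imm) & MASK32


def add32(x: int, imm: int) -> int:
    """32-bit wrapping add."""
    return (x + imm) & MASK32


# The constant from the test case fits the 16-bit immediate field.
assert fits_simm16(257)

# Disjoint operands: no bit position generates a carry, so OR equals ADD.
assert is_disjoint(0x1000, 257)
assert or32(0x1000, 257) == add32(0x1000, 257)

# Overlapping operands: bit 0 is set in both, so the results differ.
assert not is_disjoint(1, 257)
assert or32(1, 257) != add32(1, 257)
```

The second pair of assertions is why the first function in the test case cannot be folded: without the disjoint flag nothing rules out an input such as `%x = 1`.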