Issue 177753
Summary AMDGPU misses fold of disjoint s_or_b32 to s_addk_i32
Labels good first issue, backend:AMDGPU, missed-optimization
Assignees
Reporter arsenm
[This code size optimization](https://github.com/llvm/llvm-project/blob/69059c42f7539fd6c41b0a152862ae8741bf8016/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp#L913), which transforms scalar adds with a small constant value into s_addk_i32, can be extended to handle s_or_b32 when the or has the disjoint flag. With disjoint operands no bit is set on both sides, so the or produces the same value as an add.

```
; RUN: llc -mtriple=amdgcn-amd-amdpal -mcpu=gfx900 < %s

; Cannot fold without disjoint flag
define amdgpu_ps i32 @s_or_b32_i32(i32 inreg %x) {
  %or = or i32 %x, 257
  ret i32 %or
}

define amdgpu_ps i32 @s_or_b32_disjoint_to_s_addk_i32(i32 inreg %x) {
  %or = or disjoint i32 %x, 257
  ret i32 %or
}
```

