Issue 55470
Summary [InstCombine] Missed min -> max optimization
Labels llvm:codegen, llvm:instcombine, missed-optimization
Assignees
Reporter RKSimon
This branchless max code turns up occasionally (e.g. https://github.com/dendibakh/perf-ninja/blob/main/labs/core_bound/vectorization_1/solution.hpp):
```
template <typename T>
inline T max(T a, T b) {
	return a - ((a-b) & (a-b)>>31);
}
```
https://godbolt.org/z/snrax36PG
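
For readers unfamiliar with the idiom, here is a minimal sketch of the same expression split into steps for 32-bit signed values. It assumes `a - b` does not overflow and that `>>` on a negative value is an arithmetic shift (guaranteed since C++20, implementation-defined before). The names `branchless_max` and `matches_std_max` are illustrative and do not appear in the linked code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Step-by-step form of the branchless max, for 32-bit signed values.
// Assumption: a - b does not overflow, and >> is an arithmetic shift.
inline std::int32_t branchless_max(std::int32_t a, std::int32_t b) {
    std::int32_t diff = a - b;           // negative exactly when a < b
    std::int32_t mask = diff >> 31;      // -1 (all ones) if diff < 0, else 0
    std::int32_t clamped = diff & mask;  // diff if diff < 0, else 0, i.e. min(diff, 0)
    return a - clamped;                  // a - (a - b) = b if a < b, else a
}

// Hypothetical helper: compares against std::max on a few sample pairs.
inline bool matches_std_max() {
    const std::int32_t samples[] = {-1000, -1, 0, 1, 7, 1000};
    for (std::int32_t a : samples)
        for (std::int32_t b : samples)
            if (branchless_max(a, b) != std::max(a, b)) return false;
    return true;
}
```

The `diff & mask` step is what shows up as `llvm.smin(diff, 0)` in the IR.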

which we only optimize to:

```
define i32 @foo(i32 %x, i32 %y) {
entry:
  %sub.i = sub nsw i32 %x, %y
  %0 = tail call i32 @llvm.smin.i32(i32 %sub.i, i32 0)
  %sub2.i = sub i32 %x, %0
  ret i32 %sub2.i
}
declare i32 @llvm.smin.i32(i32, i32)
```
So we're missing a final stage:
```
----------------------------------------
define i8 @src(i8 %x, i8 %y) {
%0:
  %diff = sub nsw i8 %x, %y
  %smin = smin i8 %diff, 0
  %sub = sub i8 %x, %smin
  ret i8 %sub
}
=>
define i8 @tgt(i8 %x, i8 %y) {
%0:
  %r = smax i8 %x, %y
  ret i8 %r
}
Transformation seems to be correct!
```
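
As an informal cross-check of the proof above, the following sketch enumerates every i8 pair and compares `x - smin(x - y, 0)` with `smax(x, y)`. It skips pairs whose difference does not fit in i8, since those are poison under `sub nsw`. The function name `count_smax_mismatches` is hypothetical, and this is a brute-force check rather than a substitute for the Alive2 result.

```cpp
#include <algorithm>
#include <cstdint>

// Counts i8 pairs (x, y) where x - smin(x - y, 0) != smax(x, y).
// Pairs whose difference overflows i8 are skipped (poison under `sub nsw`).
inline int count_smax_mismatches() {
    int mismatches = 0;
    for (int x = -128; x <= 127; ++x) {
        for (int y = -128; y <= 127; ++y) {
            int diff = x - y;                         // computed in int, cannot overflow
            if (diff < -128 || diff > 127) continue;  // nsw violated: skip this pair
            int smin = std::min(diff, 0);
            // The outer sub carries no nsw flag, so wrap the result to 8 bits.
            auto sub = static_cast<std::int8_t>(static_cast<std::uint8_t>(x - smin));
            if (sub != std::max(x, y)) ++mismatches;
        }
    }
    return mismatches;
}
```

Under these assumptions the count should be zero, which matches the fold to `smax`.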

I think we're missing the inverse and unsigned variants as well.