| Issue | 55470 |
|---|---|
| Summary | [InstCombine] Missed min -> max optimization |
| Labels | llvm:codegen, llvm:instcombine, missed-optimization |
| Assignees | |
| Reporter | RKSimon |

This branchless max code turns up occasionally (e.g. https://github.com/dendibakh/perf-ninja/blob/main/labs/core_bound/vectorization_1/solution.hpp):
```
template <typename T>
inline T max(T a, T b) {
  return a - ((a-b) & (a-b)>>31);
}
```
https://godbolt.org/z/snrax36PG
which we only optimize to:
```
define i32 @foo(i32 %x, i32 %y) {
entry:
  %sub.i = sub nsw i32 %x, %y
  %0 = tail call i32 @llvm.smin.i32(i32 %sub.i, i32 0)
  %sub2.i = sub i32 %x, %0
  ret i32 %sub2.i
}
declare i32 @llvm.smin.i32(i32, i32)
```
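For readers following along, here is a minimal sketch of why the source and the partially optimized form agree. It specialises the template to `int32_t`, and it assumes that `a - b` does not overflow (which is what the `nsw` flag in the IR encodes) and that `>>` on a negative value is an arithmetic shift (guaranteed only since C++20). The names `branchless_max` and `sub_of_smin` are made up for this illustration.

```cpp
#include <algorithm>
#include <cstdint>

// The bit trick from the report, specialised to int32_t.
// Assumption: a - b does not overflow, and >> is an arithmetic shift.
inline int32_t branchless_max(int32_t a, int32_t b) {
    int32_t diff = a - b;
    int32_t mask = diff >> 31;   // all ones if diff < 0, otherwise 0
    return a - (diff & mask);    // diff & mask equals min(diff, 0)
}

// The shape InstCombine currently reaches: x - smin(x - y, 0).
inline int32_t sub_of_smin(int32_t x, int32_t y) {
    return x - std::min<int32_t>(x - y, 0);
}
```

When `diff` is negative the subtraction yields `a - (a - b)`, which is `b`; otherwise it yields `a`. Either way the result is the larger operand.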
So we're missing a final stage:
```
----------------------------------------
define i8 @src(i8 %x, i8 %y) {
%0:
  %diff = sub nsw i8 %x, %y
  %smin = smin i8 %diff, 0
  %sub = sub i8 %x, %smin
  ret i8 %sub
}
=>
define i8 @tgt(i8 %x, i8 %y) {
%0:
  %r = smax i8 %x, %y
  ret i8 %r
}
Transformation seems to be correct!
```
I think we're missing the inverse and unsigned variants as well.
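
As a rough sanity check, the sketch below brute-forces all `i8` pairs for the fold above and for one possible reading of the "inverse" variant, namely `x - smax(x - y, 0)` folding to `smin(x, y)`. That reading is an assumption, not something stated in the report, and the unsigned variants are not covered. The helper name `count_fold_mismatches` is hypothetical. Pairs where `x - y` does not fit in `i8` are skipped, since the `nsw` subtraction would be poison there.

```cpp
#include <algorithm>

// Hypothetical helper: counts i8 input pairs for which either fold
// disagrees with the expected min/max, under the nsw precondition.
inline int count_fold_mismatches() {
    int mismatches = 0;
    for (int x = -128; x <= 127; ++x) {
        for (int y = -128; y <= 127; ++y) {
            int diff = x - y;
            // sub nsw would overflow i8 here, so the source is poison: skip.
            if (diff < -128 || diff > 127) continue;
            // Reported fold: x - smin(x - y, 0) -> smax(x, y).
            if (x - std::min(diff, 0) != std::max(x, y)) ++mismatches;
            // Assumed inverse: x - smax(x - y, 0) -> smin(x, y).
            if (x - std::max(diff, 0) != std::min(x, y)) ++mismatches;
        }
    }
    return mismatches;
}
```

The outer subtraction always produces either `x` or `y`, so it never leaves the `i8` range and plain `int` arithmetic is enough to model it. This is only an exhaustive check at one bit width; an Alive2 proof would still be the authoritative check for each variant.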