| Issue |
168880
|
| Summary |
[InstCombine] Preserve nsw constraint via @llvm.assume when transforming umin + sub nsw to usub.sat to enable further folding
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
ParkHanbum
|
Description
Currently, InstCombine transforms a pattern involving umin and sub nsw into @llvm.usub.sat. While this transformation is a valid refinement (since sub nsw produces poison on overflow, whereas usub.sat produces a defined value), it inadvertently discards the range information implied by the nsw flag.
Losing this constraint prevents subsequent optimizations from folding the code into a more efficient form (e.g., icmp slt checks).
Current Behavior
```
define i1 @src(i8 noundef %x, i8 noundef %y, i8 %z) {
%v0 = tail call i8 @llvm.umin.i8(i8 %x, i8 %y)
; The 'nsw' flag implies that %x - %v0 does not wrap in signed arithmetic.
%v1 = sub nsw i8 %x, %v0
%v2 = icmp slt i8 %v1, %z
ret i1 %v2
}
; Transformed to:
define i1 @tgt(i8 noundef %x, i8 noundef %y, i8 %z) {
; Constraint info is lost here. usub.sat is well-defined for all inputs.
%sub = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y)
%cmp = icmp slt i8 %sub, %z
ret i1 %cmp
}
```
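The refinement claim can be checked exhaustively for i8. The following is a minimal sketch in Python, not part of LLVM: the helper names (`to_signed`, `sub_nsw`, `usub_sat`, `check_refinement`) are hypothetical, and poison is modeled as `None`. It confirms that wherever `sub nsw` is defined, `x - umin(x, y)` equals `usub.sat(x, y)`, and it counts the input pairs that `nsw` rules out.

```python
BITS = 8
MASK = (1 << BITS) - 1
SMIN = -(1 << (BITS - 1))
SMAX = (1 << (BITS - 1)) - 1


def to_signed(v):
    """Interpret an unsigned 8-bit value as two's-complement signed."""
    return v - (1 << BITS) if v > SMAX else v


def sub_nsw(a, b):
    """Model `sub nsw i8 a, b`; None stands for poison on signed wrap."""
    exact = to_signed(a) - to_signed(b)
    if exact < SMIN or exact > SMAX:
        return None
    return exact & MASK


def usub_sat(a, b):
    """Model `@llvm.usub.sat.i8`: unsigned subtraction clamped at zero."""
    return max(a - b, 0)


def check_refinement():
    """Compare @src's subtraction with usub.sat over all i8 pairs.

    Returns the number of (x, y) pairs for which `sub nsw` is poison.
    """
    poison_pairs = 0
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            v1 = sub_nsw(x, min(x, y))  # min on unsigned values models umin
            if v1 is None:
                poison_pairs += 1  # any target value refines poison
                continue
            assert v1 == usub_sat(x, y), (x, y)
    return poison_pairs


if __name__ == "__main__":
    print("poison pairs excluded by nsw:", check_refinement())
```

In this model the excluded pairs are exactly those where %x is negative as a signed value and %y is a smaller non-negative value. A negative %x on its own is not excluded: `sub_nsw(255, 254)` (that is, -1 minus -2) is defined.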
The Problem
The target IR (@tgt) cannot be further optimized into the ideal form (@final_tgt) because usub.sat is defined for every input, including those for which the original sub nsw would have been poison (e.g., when %x is negative in a signed context, and therefore large in an unsigned context, while %y is a smaller non-negative value).
Without the explicit knowledge that "inputs causing signed wrap are impossible" (which was originally provided by nsw), the optimizer cannot safely transform the usub.sat + icmp pattern into a simple add + icmp.
Desired Optimization
If we preserve the constraint, we should be able to reach this optimal form:
```
define i1 @final_tgt(i8 noundef %x, i8 noundef %y, i8 %z) {
%sum = add i8 %y, %z
%cmp = icmp slt i8 %x, %sum
ret i1 %cmp
}
```
Proposed Solution
When performing the transform from @src to @tgt, we should materialize the implicit nsw constraint into an explicit @llvm.assume. This preserves the range information, allowing subsequent passes (like ValueTracking/InstCombine) to prove the safety of transforming to @final_tgt.
Intermediate IR (Proposed Step):
```
define i1 @tgt_with_assume(i8 noundef %x, i8 noundef %y, i8 %z) {
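; Explicitly preserve the constraint implied by 'sub nsw'.
; NOTE: 'x >= 0' is only an illustrative stand-in. The exact condition is that
; %x - umin(%x, %y) does not wrap in signed arithmetic, which admits some negative %x.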
%x_pos = icmp sge i8 %x, 0
call void @llvm.assume(i1 %x_pos)
%sub = call i8 @llvm.usub.sat.i8(i8 %x, i8 %y)
%cmp = icmp slt i8 %sub, %z
ret i1 %cmp
}
```
With the assume intrinsic present, the optimizer has an explicit fact about %x (e.g., that it is non-negative) to use when attempting the transformation to @final_tgt. Since @llvm.assume emits no machine code, this incurs no runtime overhead while potentially enabling better code generation.
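Before relying on a candidate fold, it can be checked by brute force over all i8 inputs, much as one would with Alive2. The sketch below is a hypothetical Python model, not LLVM code: `src`, `final_tgt`, `find_counterexample`, and `x_non_negative` are illustrative names, and poison is modeled as `None`. It searches for an input that satisfies a given precondition and on which @src is defined but disagrees with @final_tgt.

```python
from itertools import product


def to_signed(v):
    """Interpret an unsigned 8-bit value as two's-complement signed."""
    return v - 256 if v > 127 else v


def src(x, y, z):
    """Model @src; returns None when `sub nsw` would be poison."""
    v0 = min(x, y)  # umin on unsigned values
    exact = to_signed(x) - to_signed(v0)
    if exact < -128 or exact > 127:
        return None
    return exact < to_signed(z)  # icmp slt


def final_tgt(x, y, z):
    """Model @final_tgt: wrapping add followed by icmp slt."""
    total = (y + z) & 0xFF
    return to_signed(x) < to_signed(total)


def find_counterexample(precondition):
    """Return the first (x, y, z) where @final_tgt fails to refine @src."""
    for x, y, z in product(range(256), repeat=3):
        if not precondition(x, y, z):
            continue  # input excluded by the assumed constraint
        expected = src(x, y, z)
        if expected is None:
            continue  # source is poison, so any result is allowed
        if final_tgt(x, y, z) != expected:
            return (x, y, z)
    return None


def x_non_negative(x, y, z):
    """The illustrative `icmp sge i8 %x, 0` assumption."""
    return to_signed(x) >= 0


if __name__ == "__main__":
    print(find_counterexample(x_non_negative))
```

In this model the search does return a counterexample under the `x >= 0` precondition (for instance x = 0, y = 1, z = 0, where @src yields false and @final_tgt yields true). This suggests that either the preserved constraint or the target form would need adjusting before the fold is sound.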
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs