| Issue |
185271
|
| Summary |
[InstCombine] Incorrectly scales fixed-result `vector.extract` index when folding `get_active_lane_mask`
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
cardigan1008
|
Here is a miscompilation case when reviewing https://github.com/llvm/llvm-project/pull/183329:
```llvm
define <4 x i1> @bail_lhs_is_zero() vscale_range(4, 4) {
%mask = call <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32 0, i32 15)
%ext = call <4 x i1> @llvm.vector.extract.v4i1.nxv4i1(<vscale x 4 x i1> %mask, i64 4)
ret <4 x i1> %ext
}
declare <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32, i32)
declare <4 x i1> @llvm.vector.extract.v4i1.nxv4i1(<vscale x 4 x i1>, i64)
```
With patch built on this patch, it's transformed into:
```llvm
define <4 x i1> @bail_lhs_is_zero() #0 {
ret <4 x i1> zeroinitializer
}
declare <vscale x 4 x i1> @llvm.get.active.lane.mask.nxv4i1.i32(i32, i32) #1
declare <4 x i1> @llvm.vector.extract.v4i1.nxv4i1(<vscale x 4 x i1>, i64 immarg) #1
attributes #0 = { vscale_range(4,4) }
attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
```
Compiler Explorer: https://godbolt.org/z/1sEGW8jhz
> Note: This is a review assisted with a self-built agent. The reproducer was validated manually. Please let me know if anything is wrong.
**Bug Triggering Analysis:**
The test case triggers the bug because it extracts a fixed-width subvector from the result of `get_active_lane_mask`, while the function has `vscale_range(4, 4)`. For a fixed-width vector.extract, the extract index is interpreted as a plain element index, so this extract starts at lane 4 and returns lanes 4–7. For `get_active_lane_mask(0, 15)`, those lanes are still active, so the extracted result is not all-false. However, the transform incorrectly folds it to zeroinitializer, changing the program’s meaning.
```
full mask: [T T T T T T T T T T T T T T T F]
extract @ 4: [T T T T]
wrong fold: [F F F F]
```
**Fix Weakness Analysis:**
The weakness in the fix is that it uses `ExtractIdx * VScaleMin` to decide whether all extracted lanes are inactive. That reasoning is only valid when the extract result is itself scalable. In this test case, the result is fixed-width, so the starting lane is just `ExtractIdx`, not `ExtractIdx * vscale`. As a result, the fix applies scalable-index reasoning to a fixed-result extract and can incorrectly conclude that the extracted subvector is entirely inactive, leading to a wrong-code fold to zero.
cc @kmclaughlin-arm
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs