| Field | Value |
| --- | --- |
| Issue | 177532 |
| Summary | InstCombine assertion from lshr (zext i32 X), 32 created by EarlyCSE: missing fold in default pipeline |
| Labels | new issue |
| Assignees | |
| Reporter | Jinlock9 |
**Description**
I’m hitting an InstCombine assertion when compiling a small, valid C++ program at `-O2` with `LLVM_ENABLE_ASSERTIONS=ON`. The same compilation succeeds when assertions are disabled.
After investigation, the failure is caused by IR of the following form reaching InstCombine:
```
lshr i64 (zext i32 %x to i64), 32
```
Semantically, this expression is always zero, because the upper 32 bits of a `zext` from `i32` to `i64` are known to be zero. However, this form survives until InstCombine and triggers an assertion during shift reassociation.
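As a quick sanity check of that claim, here is a minimal sketch that models the IR with plain C++ integers. It does not use LLVM, and `lshr_of_zext` is a hypothetical helper name chosen for illustration.

```cpp
#include <cstdint>

// Models `lshr i64 (zext i32 %x to i64), 32` with plain integers.
// Hypothetical helper for illustration; not part of LLVM.
constexpr std::uint64_t lshr_of_zext(std::uint32_t x) {
    // The cast plays the role of zext (upper 32 bits become zero);
    // the shift by 32 then keeps only those upper bits.
    return static_cast<std::uint64_t>(x) >> 32;
}

static_assert(lshr_of_zext(0u) == 0, "zero stays zero");
static_assert(lshr_of_zext(14336u) == 0, "constant from the reproducer");
static_assert(lshr_of_zext(0xFFFFFFFFu) == 0, "all low bits set");
```

The `static_assert` lines only spot-check a few inputs, but the argument holds for every 32-bit value because no input bit ever lands in the upper half.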
---
**Observed assertion**
```
clang: llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp:1441:
Assertion `ShAmtC < X->getType()->getScalarSizeInBits() &&
"Big shift not simplified to zero?"' failed.
```
This occurs in `InstCombinerImpl::visitLShr()`.
---
**IR evolution**
**Before EarlyCSE:**
```
for.body:                                         ; preds = %for.cond
  %mul = shl nuw nsw i32 %j.0, 8
  %add = add nuw nsw i32 %mul, 14336
  %coord.sroa.4.0.insert.ext = zext i32 0 to i64
  %coord.sroa.4.0.insert.shift = shl i64 %coord.sroa.4.0.insert.ext, 32
  %coord.sroa.4.0.insert.mask = and i64 undef, 4294967295
  %coord.sroa.4.0.insert.insert = or i64 %coord.sroa.4.0.insert.mask, %coord.sroa.4.0.insert.shift
  %coord.sroa.0.0.insert.ext = zext i32 %add to i64
  %coord.sroa.0.0.insert.mask = and i64 %coord.sroa.4.0.insert.insert, -4294967296
  %coord.sroa.0.0.insert.insert = or i64 %coord.sroa.0.0.insert.mask, %coord.sroa.0.0.insert.ext
  %coord.sroa.1.0.extract.shift.i = lshr i64 %coord.sroa.0.0.insert.insert, 32
  %coord.sroa.1.0.extract.trunc.i = trunc nuw i64 %coord.sroa.1.0.extract.shift.i to i32
```
**After EarlyCSE:**
```
for.body:                                         ; preds = %for.cond
  %mul = shl nuw nsw i32 %j.0, 8
  %add = add nuw nsw i32 %mul, 14336
  %coord.sroa.0.0.insert.ext = zext i32 %add to i64
  %coord.sroa.1.0.extract.shift.i = lshr i64 %coord.sroa.0.0.insert.ext, 32
  %coord.sroa.1.0.extract.trunc.i = trunc nuw i64 %coord.sroa.1.0.extract.shift.i to i32
```
At this point, the `lshr` is trivially zero, but remains in the IR.
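To see why the simplified form is equivalent, the mask and insert arithmetic from the "before" IR can be replayed with plain integers. This is only a sketch: `packed_before_cse`, `packed_after_cse`, and the `undef_bits` parameter (a stand-in for the `undef` operand) are hypothetical names, not LLVM code.

```cpp
#include <cstdint>

// Plain-integer model of the pre-EarlyCSE IR. `undef_bits` stands in for
// the `undef` operand; any value may be passed. Hypothetical helper.
std::uint64_t packed_before_cse(std::uint32_t add, std::uint64_t undef_bits) {
    std::uint64_t hi_ext   = 0;                               // zext i32 0 to i64
    std::uint64_t hi_shift = hi_ext << 32;                    // shl i64 ..., 32
    std::uint64_t lo_mask  = undef_bits & 4294967295ULL;      // and i64 undef, 4294967295
    std::uint64_t insert1  = lo_mask | hi_shift;              // or (first insert)
    std::uint64_t lo_ext   = add;                             // zext i32 %add to i64
    std::uint64_t hi_mask  = insert1 & 0xFFFFFFFF00000000ULL; // and i64 ..., -4294967296
    return hi_mask | lo_ext;                                  // or (second insert)
}

// Model of what remains after EarlyCSE: just the zero-extended %add.
std::uint64_t packed_after_cse(std::uint32_t add) {
    return static_cast<std::uint64_t>(add);
}
```

In this model `insert1` only ever has low bits set, so masking it with the high-half mask always gives zero, whatever `undef_bits` is. Both functions therefore return the same value, and shifting that value right by 32 yields zero.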
---
**What goes wrong in InstCombine**
Later, `InstCombinerImpl::visitLShr()` attempts to reassociate the shift:
```
lshr (zext i32 X to i64), 32
→ zext (lshr i32 X, 32)
```
With a shift amount of 32 and a 32-bit source type, the narrowed shift would be out of range, so this transformation violates the invariant:
```
ShAmtC < X->getType()->getScalarSizeInBits()
```
and triggers the assertion.
The assertion message itself (“Big shift not simplified to zero?”) suggests that this pattern is expected to have been folded earlier, but the default `-O2` pipeline does not guarantee that.
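The role of the invariant can be illustrated outside LLVM with a small model. The sketch below is not the InstCombine code; `can_narrow_shift`, `narrowed`, and `wide` are hypothetical names that mirror the condition in the assertion and the two forms of the expression.

```cpp
#include <cstdint>

// Hypothetical stand-in for the condition the assertion checks: the shift
// may only be moved to the narrow type if the amount fits in that type.
constexpr bool can_narrow_shift(unsigned sh_amt, unsigned src_bits) {
    return sh_amt < src_bits;
}

// Models `zext (lshr i32 X, sh_amt)`. Only call this when
// can_narrow_shift(sh_amt, 32) holds: shifting a 32-bit value by 32 or
// more is undefined behaviour in C++ and yields poison in LLVM IR.
std::uint64_t narrowed(std::uint32_t x, unsigned sh_amt) {
    return static_cast<std::uint64_t>(x >> sh_amt);
}

// Models the original `lshr (zext i32 X to i64), sh_amt`, which is well
// defined for any sh_amt below 64.
std::uint64_t wide(std::uint32_t x, unsigned sh_amt) {
    return static_cast<std::uint64_t>(x) >> sh_amt;
}
```

For an in-range amount such as 8 the two forms agree. For 32 only the wide form is defined, and its result is zero, which is what the assertion message expects to have been folded already.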
---
**Default pipeline context**
In the generic `-O2` function simplification pipeline
(`PassBuilder::buildFunctionSimplificationPipeline`):
```cpp
FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));
if (EnableKnowledgeRetention)
  FPM.addPass(AssumeSimplifyPass());
if (EnableGVNHoist)
  FPM.addPass(GVNHoistPass());
if (EnableGVNSink) {
  FPM.addPass(GVNSinkPass());
  FPM.addPass(
      SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
}
FPM.addPass(SpeculativeExecutionPass(/*OnlyIfDivergentTarget=*/true));
FPM.addPass(JumpThreadingPass());
FPM.addPass(CorrelatedValuePropagationPass());
FPM.addPass(
    SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
FPM.addPass(InstCombinePass());
```
There is no pass between EarlyCSE and InstCombine that guarantees folding of
`lshr (zext i32 X), 32` using known-zero bits.
---
**Minimal C++ reproducer**
```cpp
template <int N>
struct IntArray {
  int data[N];
  int &operator[](int i) { return data[i]; }
  const int &operator[](int i) const { return data[i]; }
};

void process_coord(IntArray<2> coord) {
  int y = coord[1];
  volatile int result = y + 1;
  (void)result;
}

int main() {
  int inter_size = 14336;
  int tile_inter = 256;
  for (int j = 0; j < 2; j++) {
    IntArray<2> coord;
    coord[0] = inter_size + j * tile_inter;
    coord[1] = 0;
    process_coord(coord);
  }
  return 0;
}
```
**Compile command**
```
clang -O2 reproduce.cpp
```
(Reproduces with an LLVM build configured with `LLVM_ENABLE_ASSERTIONS=ON`.)
---
**Analysis**
* EarlyCSE legally exposes `lshr (zext i32 X), 32`.
* No pass before InstCombine is guaranteed to fold it using known bits.
* InstCombine assumes this form cannot reach its reassociation logic.
* When it does, an internal invariant is violated and the assertion fires.

This appears to be an InstCombine robustness issue rather than an EarlyCSE bug.
---
**Expected fix direction (may need feedback)**
Handle this defensively, e.g. in `simplifyLShrInst()` (part of InstructionSimplify, which `visitLShr()` calls before its own folds) or directly in `visitLShr()`:
* Use known-bits information to fold `lshr (zext i32 X), >= 32` to zero
* Otherwise, fall back to existing logic
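The known-bits check proposed above can be sketched independently of LLVM. The model below is deliberately simplified: `KnownZero64`, `known_after_zext`, and `lshr_folds_to_zero` are hypothetical names, and the struct tracks only known-zero bits of a 64-bit value rather than reproducing `llvm::KnownBits`.

```cpp
#include <cstdint>

// Minimal stand-in for known-bits tracking on a 64-bit value: a set bit in
// `zero` means that bit of the value is known to be 0. Simplified model
// for illustration only; this is not llvm::KnownBits.
struct KnownZero64 {
    std::uint64_t zero;
};

// Known bits of `zext iN %x to i64` when nothing is known about %x:
// every bit at position >= src_bits is known zero.
// Assumes 0 < src_bits <= 64.
KnownZero64 known_after_zext(unsigned src_bits) {
    KnownZero64 k;
    k.zero = (src_bits >= 64) ? 0 : (~std::uint64_t(0) << src_bits);
    return k;
}

// `lshr V, sh_amt` is zero when every bit that survives the shift
// (positions sh_amt..63) is known zero. Assumes sh_amt < 64.
bool lshr_folds_to_zero(KnownZero64 k, unsigned sh_amt) {
    std::uint64_t surviving = ~std::uint64_t(0) << sh_amt;
    return (k.zero & surviving) == surviving;
}
```

Under these assumptions, `lshr_folds_to_zero(known_after_zext(32), 32)` is true, while a shift amount of 31 is not foldable because bit 31 of the source is unknown. A real fix would rely on LLVM's existing known-bits analysis rather than a hand-rolled mask.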
---
**LLVM version / build**
```
llvm-project commit 10a245bd02749fd7ec757a90b9fda83f51cd138c
LLVM_ENABLE_ASSERTIONS=ON
```