Issue: 177532
Summary: InstCombine assertion from `lshr (zext i32 X), 32` created by EarlyCSE (missing fold in default pipeline)
Labels: new issue
Assignees: (none)
Reporter: Jinlock9

**Description**

I’m hitting an InstCombine assertion when compiling a small, valid C++ program at `-O2` with a clang built with `LLVM_ENABLE_ASSERTIONS=ON`. The same compilation succeeds when assertions are disabled.

After investigation, the failure is caused by IR of the following form reaching InstCombine:

```
lshr i64 (zext i32 %x to i64), 32
```

Semantically, this expression is always zero, because the upper 32 bits of a `zext i32` are known zero. However, this form survives until InstCombine and triggers an assertion during shift reassociation.
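
The "always zero" claim can be checked in plain C++, where widening a `uint32_t` to `uint64_t` plays the role of `zext i32 ... to i64` and `>> 32` plays the role of the `lshr`. This is only an illustrative sketch; the helper name `lshr_of_zext` is made up for this example and is not an LLVM function.

```cpp
#include <cassert>
#include <cstdint>

// Models `lshr i64 (zext i32 %x to i64), 32`.
// Hypothetical helper, not part of LLVM.
constexpr std::uint64_t lshr_of_zext(std::uint32_t x) {
  // The widening cast zero-fills bits 32..63; the shift keeps only those bits.
  return static_cast<std::uint64_t>(x) >> 32;
}

// Holds for the extreme inputs, and by the same argument for every i32 value.
static_assert(lshr_of_zext(0u) == 0, "zero input");
static_assert(lshr_of_zext(0xFFFFFFFFu) == 0, "all-ones input");
```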

---

**Observed assertion**

```
clang: llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp:1441:
Assertion `ShAmtC < X->getType()->getScalarSizeInBits() &&
"Big shift not simplified to zero?"' failed.
```

This occurs in `InstCombinerImpl::visitLShr()`.

---

**IR evolution**

**Before EarlyCSE:**

```
for.body:                                         ; preds = %for.cond
  %mul = shl nuw nsw i32 %j.0, 8
  %add = add nuw nsw i32 %mul, 14336
  %coord.sroa.4.0.insert.ext = zext i32 0 to i64
  %coord.sroa.4.0.insert.shift = shl i64 %coord.sroa.4.0.insert.ext, 32
  %coord.sroa.4.0.insert.mask = and i64 undef, 4294967295
  %coord.sroa.4.0.insert.insert = or i64 %coord.sroa.4.0.insert.mask,
                                           %coord.sroa.4.0.insert.shift
  %coord.sroa.0.0.insert.ext = zext i32 %add to i64
  %coord.sroa.0.0.insert.mask = and i64 %coord.sroa.4.0.insert.insert,
                                      -4294967296
  %coord.sroa.0.0.insert.insert = or i64 %coord.sroa.0.0.insert.mask,
                                           %coord.sroa.0.0.insert.ext
  %coord.sroa.1.0.extract.shift.i =
      lshr i64 %coord.sroa.0.0.insert.insert, 32
  %coord.sroa.1.0.extract.trunc.i =
      trunc nuw i64 %coord.sroa.1.0.extract.shift.i to i32
```

**After EarlyCSE:**

```
for.body:                                         ; preds = %for.cond
  %mul = shl nuw nsw i32 %j.0, 8
  %add = add nuw nsw i32 %mul, 14336
  %coord.sroa.0.0.insert.ext = zext i32 %add to i64
  %coord.sroa.1.0.extract.shift.i =
      lshr i64 %coord.sroa.0.0.insert.ext, 32
  %coord.sroa.1.0.extract.trunc.i =
      trunc nuw i64 %coord.sroa.1.0.extract.shift.i to i32
```

At this point, the `lshr` is trivially zero, but remains in the IR.
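
To see why the simplification by EarlyCSE is legal, the "before" sequence can be replayed in C++ on 64-bit integers. The sketch below mirrors the instructions one by one; the function names are hypothetical, and the `undef` operand is modeled as an arbitrary caller-supplied value, which is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Replays the "Before EarlyCSE" instructions. `undef_bits` stands in for `undef`.
// Hypothetical helper, not LLVM code.
inline std::uint64_t packed_before_early_cse(std::uint32_t add,
                                             std::uint64_t undef_bits) {
  std::uint64_t hi_ext = 0;                               // zext i32 0 to i64
  std::uint64_t hi_shift = hi_ext << 32;                  // shl ..., 32
  std::uint64_t lo_mask = undef_bits & 4294967295ULL;     // and undef, 4294967295
  std::uint64_t hi_insert = lo_mask | hi_shift;           // or
  std::uint64_t lo_ext = add;                             // zext i32 %add to i64
  std::uint64_t hi_mask = hi_insert & 0xFFFFFFFF00000000ULL;  // and ..., -4294967296
  return hi_mask | lo_ext;                                // or
}

// After EarlyCSE only the zext of %add is left.
inline std::uint64_t packed_after_early_cse(std::uint32_t add) {
  return add;
}
```

Whatever value is passed for `undef_bits`, its low half is masked away by the second `and` and the high half is built from a constant zero, so both functions return the same packed value and its upper 32 bits are zero.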

---

**What goes wrong in InstCombine**

Later, `InstCombinerImpl::visitLShr()` attempts to reassociate the shift:

```
lshr (zext i32 X to i64), 32
  → zext (lshr i32 X, 32)
```

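The narrowed shift amount (32) equals the bit width of `X` (`i32`), and a shift by the full width or more yields poison, so this transformation violates the invariant: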

```
ShAmtC < X->getType()->getScalarSizeInBits()
```

and triggers the assertion.

The assertion message itself (“Big shift not simplified to zero?”) suggests that this pattern is expected to have been folded earlier, but the default `-O2` pipeline does not guarantee that.
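
The role of the invariant can be illustrated with a small C++ model of the two shift orders. This is a sketch under assumptions: the function names are hypothetical, the source type is fixed to 32 bits, and the `assert` merely imitates the kind of precondition the real code checks.

```cpp
#include <cassert>
#include <cstdint>

// The narrowed shift is only defined when the amount is below the source width.
constexpr bool narrow_shift_is_valid(unsigned src_bits, unsigned sh_amt) {
  return sh_amt < src_bits;
}

// Original form: lshr (zext i32 X to i64), C. Well-defined for C < 64.
inline std::uint64_t zext_then_lshr(std::uint32_t x, unsigned c) {
  assert(c < 64);
  return static_cast<std::uint64_t>(x) >> c;
}

// Rewritten form: zext (lshr i32 X, C). Requires C < 32.
inline std::uint64_t lshr_then_zext(std::uint32_t x, unsigned c) {
  assert(narrow_shift_is_valid(32, c) && "shift amount must be below 32");
  return static_cast<std::uint64_t>(x >> c);
}
```

For every amount below 32 the two forms agree. At exactly 32 only the original form is defined (and evaluates to zero), which is the case that reaches the assertion.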

---

**Default pipeline context**

In the generic `-O2` function simplification pipeline
(`PassBuilder::buildFunctionSimplificationPipeline`):

```cpp
FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));
if (EnableKnowledgeRetention)
  FPM.addPass(AssumeSimplifyPass());

if (EnableGVNHoist)
  FPM.addPass(GVNHoistPass());

if (EnableGVNSink) {
  FPM.addPass(GVNSinkPass());
  FPM.addPass(
      SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
}

FPM.addPass(SpeculativeExecutionPass(/* OnlyIfDivergentTarget =*/true));
FPM.addPass(JumpThreadingPass());
FPM.addPass(CorrelatedValuePropagationPass());

FPM.addPass(
    SimplifyCFGPass(SimplifyCFGOptions().convertSwitchRangeToICmp(true)));
FPM.addPass(InstCombinePass());
```

There is no pass between EarlyCSE and InstCombine that guarantees folding of
`lshr (zext i32 X), 32` using known-zero bits.

---

**Minimal C++ reproducer**

```cpp
template <int N>
struct IntArray {
  int data[N];
  int &operator[](int i) { return data[i]; }
  const int &operator[](int i) const { return data[i]; }
};

void process_coord(IntArray<2> coord) {
  int y = coord[1];
  volatile int result = y + 1;
  (void)result;
}

int main() {
  int inter_size = 14336;
  int tile_inter = 256;

  for (int j = 0; j < 2; j++) {
    IntArray<2> coord;
    coord[0] = inter_size + j * tile_inter;
    coord[1] = 0;
    process_coord(coord);
  }

  return 0;
}
```

**Compile command**

```
clang -O2 reproduce.cpp
```

(Reproduces with an LLVM build configured with `LLVM_ENABLE_ASSERTIONS=ON`.)

---

**Analysis**

* EarlyCSE legally exposes `lshr (zext i32 X), 32`
* No pass before InstCombine guarantees folding this using known-bits
* InstCombine assumes this form cannot reach reassociation logic
* This violates an internal invariant and triggers an assertion

This appears to be an InstCombine robustness issue rather than an EarlyCSE bug.

---

**Expected fix direction (may need feedback)**

Handle this defensively in InstCombine, e.g. in `visitLShr()`, or in `simplifyLShrInst()` (part of InstructionSimplify):

* Use known-bits information to fold `lshr (zext i32 X), >= 32` to zero
* Otherwise, fall back to existing logic
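
The known-bits reasoning behind the first bullet can be sketched as a standalone C++ model. It deliberately does not use LLVM's own `KnownBits` class; the struct and function names below are hypothetical, values are fixed at 64 bits, and only known-zero bits are tracked.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for known-zero tracking on an i64 value.
// A set bit in `zero_mask` means "this bit of the value is known to be 0".
struct KnownZero64 {
  std::uint64_t zero_mask;
};

// After `zext iN -> i64`, every bit at position >= N is known zero.
inline KnownZero64 known_after_zext(unsigned src_bits) {
  assert(src_bits > 0 && src_bits < 64);
  return {~std::uint64_t{0} << src_bits};
}

// After `lshr ..., c`, known-zero bits move down by c and the top c bits are zero.
inline KnownZero64 known_after_lshr(KnownZero64 k, unsigned c) {
  assert(c < 64);
  if (c == 0) return k;
  return {(k.zero_mask >> c) | (~std::uint64_t{0} << (64 - c))};
}

// The value can be replaced by the constant 0 when every bit is known zero.
inline bool folds_to_zero(KnownZero64 k) {
  return k.zero_mask == ~std::uint64_t{0};
}
```

With a 32-bit source, any shift amount of 32 or more leaves all 64 bits known zero, so the shift could be folded to zero before the reassociation logic is reached. Smaller amounts leave some bits unknown and would fall through to the existing logic.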

---

**LLVM version / build**

```
llvm-project commit 10a245bd02749fd7ec757a90b9fda83f51cd138c
LLVM_ENABLE_ASSERTIONS=ON
```
