Issue 178805
Summary [TailRecElim] TRE fails to optimize `shl` accumulators (phase ordering issue with `InstCombine`)
Labels new issue
Assignees
Reporter FedericoBruzzone
    While LLVM can optimize recursive functions with accumulators in the presence of commutative or associative instructions, such as ([compiler explorer -O1](https://godbolt.org/z/sPT1TMc6Y)):
```c
int f(int x) {
    if (x == 1) return 1;
    return f(x-1) * 3;
}
```

The Tail Recursion Elimination (TRE) pass fails to transform recursive functions into loops when the accumulator operation is a left shift (`shl`). This is particularly problematic because `InstCombine` often strength-reduces multiplications into `shl`, effectively blocking TRE from optimizing the function: a phase-ordering problem.
Even with -O3, the following function remains recursive, whereas GCC successfully transforms it ([compiler explorer -O3](https://godbolt.org/z/ee4fM597K)):
```c
int f(int x) {
    if (x == 1) return 1;
    return f(x-1) * 2; // --> %mul = shl nsw i32 %call, 1
}
```
While `shl` is neither associative nor commutative, a chain of shifts by a constant amount C is equivalent to a single shift by the sum of the amounts: $$(\dots (Base \ll C) \ll C) \dots \ll C = Base \ll (C \times Iterations)$$ (say a pseudo-associative relation $$(Base \ll C) \ll C  = Base \ll (C + C)$$?)
This is subject to overflow, but the transformation is still valid: signed integer overflow is UB in C and it is lowered to `shl nsw`, for instance.

**Improvements**
TRE should be extended to support `shl` (and potentially `lshr`/`ashr`) when the shift amount is an invariant constant. The transformation can initialize the accumulator with the base case value (more base cases could be handled, as in the link above) and then apply the current logic as is.
Of course, this fix would improve the robustness of TRE against common canonicalizations performed by, for instance, `InstCombine`, but it requires a little bit of compilation time.

Am I missing something? Is there a fundamental reason to limit TRE to associative/commutative instructions?

https://github.com/llvm/llvm-project/blob/05e908609227e1e8d993659e604a63668dfd2825/llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp#L377-L379
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to