Issue 171207
Summary [LoopFusion] Simplifying the legality checks
Labels loopoptim, loop-fusion
Assignees 1997alireza
Reporter 1997alireza
    The current loop-fusion pass contains redundant and unnecessary checks that are causing multiple issues and crashes (e.g., #166560, #166535, #165031, #80301, #166356). In addition, a required domination check is missing (#168263). Considering that the current loop fusion only supports adjacent loops, we are able to simplify the checks in this pass. Following plans outline how these loop fusion checks should be restructured. There are two approaches we have considered and tested:

### Approach 1: Simplify **[isControlFlowEquivalent](https://github.com/llvm/llvm-project/blob/a38f84718713b5c6fedb1ca0f2e800c2f1851fa1/llvm/lib/Transforms/Utils/CodeMoverUtils.cpp#L237)**
This function is currently unreliable and unnecessary. Its existing uses can be replaced or eliminated:

  1. **During candidate collection ([collectFusionCandidates](https://github.com/llvm/llvm-project/blob/de53b1a4ef4cd7c641b3ec7ca6ac272a1b292185/llvm/lib/Transforms/Scalar/LoopFuse.cpp#L668))**: A simplified version of `isControlFlowEquivalent` that only verifies if first loop dominates the second and the second post-dominates the first would be sufficient here.
  2. **During fusion ([fuseCandidates](https://github.com/llvm/llvm-project/blob/de53b1a4ef4cd7c641b3ec7ca6ac272a1b292185/llvm/lib/Transforms/Scalar/LoopFuse.cpp#L859))** in [`isSafeToMoveBefore`](https://github.com/llvm/llvm-project/blob/a38f84718713b5c6fedb1ca0f2e800c2f1851fa1/llvm/lib/Transforms/Utils/CodeMoverUtils.cpp#L312): At this stage, loop legality has already been verified the control-flow of the loops, and re-checking control-flow equivalence is redundant.[^1]

### Approach 2: Remove **isControlFlowEquivalent**
In this approach, we check the adjacency of the loops during collection rather than fusion:

  1. **During candidate collection (collectFusionCandidates)**: `isControlFlowEquivalent` is replaced with `isAdjacent`.
  2.  **During fusion (fuseCandidates)**: Condition `isAdjacent` is removed. `isControlFlowEquivalent` is also removed from `isSafeToMoveBefore` because of the same reason as in the first approach.

* In this approach, dominancy and post-dominancy of the loops are not explicitly checked anymore. No need to check whether `PDT->dominates(FC1, FC0)`, as this condition is already verified implicitly in isAdjacent. But for the guarded loops, isAdjacent is not a sufficient check to make sure that  `DT->dominates(FC0, FC1)`. So one additional change that comes with this approach is to check this dominancy along with isAdjacent.


### Additional Effects

- Simplify **FusionCandidateCompare**
In either approaches, ordering can be determined only by checking which loop dominates the other, reducing the number of required checks [here](https://github.com/llvm/llvm-project/blob/de53b1a4ef4cd7c641b3ec7ca6ac272a1b292185/llvm/lib/Transforms/Scalar/LoopFuse.cpp#L421).

### Comparing the computational complexity[^2]

- **Original loop fusion**
  - During collection (collectFusionCandidates), it checks each new candidate with the beginning of each candidate set, O(n.k).
  - During fusion (fuseCandidates), it checks each loop with all others in the set, O(n^2/k).

- **Approach 1**
  - During collection (collectFusionCandidates), it checks each new candidate with the beginning of each candidate set, O(n.k).
  - During fusion (fuseCandidates), it checks each loop only with its following loop in the set, since it is the only possible adjacent loop, O(n).

- **Approach 2**
  - During collection (collectFusionCandidates), since the condition to keep in the same set is adjacency, each new candidates need to be checked with all members of all sets, O(n^2).
  - During fusion (fuseCandidates), it checks each loop only with its following loop in the set, since it is the only adjacent loop, O(n).


[^1]: Before reaching the check `isSafeToMoveBefore`, the control-flow of the loops, and the equivalency of the guard conditions are already checked.
[^2]: n is the number of loops, k is the number of candidate sets.
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs

Reply via email to