[llvm-branch-commits] [CodeGen] Limit number of analyzed predecessors (PR #142584)
@@ -1030,6 +1036,11 @@ bool MachineBlockPlacement::isTrellis(
SmallPtrSet SeenPreds;
for (MachineBasicBlock *Succ : ViableSuccs) {
+// Compile-time optimization: runtime is quadratic in the number of
+// predecessors. For such uncommon cases, exit early.
+if (Succ->pred_size() > PredecessorLimit)
aengelke wrote:
Consider the code from the test generator below. From my understanding,
`buildChain` will iterate over all basic blocks of the chain and call
`selectBestSuccessor` for each of them, which will in turn call `isTrellis` for
every block. `isTrellis` will look at the predecessors of all successors, in
particular, it will look at all predecessors of the `merge` block, which are
all the other blocks => for almost every block, the code looks at almost all
other blocks.
Test generator, try n=4:
```python
import sys
n = int(sys.argv[1])
print("declare void @exit(i32)")
print("declare i1 @cond(i32)")
print("define i32 @f(i32 %v, i32 %v0) {")
for i in range(n):
print(f'b{i}:')
print(f' %v{i+1} = add i32 %v{i}, %v')
print(f' %c{i} = call i1 @cond(i32 %v{i+1})')
print(f' br i1 %c{i}, label %merge, label %b{i+1}')
print(f'b{n}:')
print(f' ret i32 %v{n}')
print('merge:')
print(' call void @exit(i32 1)')
print(' unreachable')
print('}')
```
```console
# Without this change
$ python3 many-preds3.test 4 | /usr/bin/time ./llvm-build/bin/llc
-filetype=obj -o /dev/null -O1
15.93user 0.17system 0:16.18elapsed 99%CPU (0avgtext+0avgdata
457748maxresident)k
0inputs+0outputs (0major+99524minor)pagefaults 0swaps
# With this change
$ python3 many-preds3.test 4 | /usr/bin/time ./llvm-build/bin/llc
-filetype=obj -o /dev/null -O1
8.82user 0.19system 0:09.10elapsed 99%CPU (0avgtext+0avgdata 457240maxresident)k
0inputs+0outputs (0major+99425minor)pagefaults 0swaps
```
https://github.com/llvm/llvm-project/pull/142584
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [CodeGen] Limit number of analyzed predecessors (PR #142584)
@@ -1030,6 +1036,11 @@ bool MachineBlockPlacement::isTrellis(
SmallPtrSet SeenPreds;
for (MachineBasicBlock *Succ : ViableSuccs) {
+// Compile-time optimization: runtime is quadratic in the number of
+// predecessors. For such uncommon cases, exit early.
+if (Succ->pred_size() > PredecessorLimit)
spupyrev wrote:
Just to confirm I understand correctly: you have an example of (generated) IR
where the runtime is large without the diff and is much smaller with the
proposed change?
(I'm not familiar with this function but I don't see where quadratic complexity
is.)
https://github.com/llvm/llvm-project/pull/142584
___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [CodeGen] Limit number of analyzed predecessors (PR #142584)
https://github.com/aengelke created https://github.com/llvm/llvm-project/pull/142584 MachineBlockPlacement has quadratic runtime in the number of predecessors: in some situation, for an edge, all predecessors of the successor are considered. Limit the number of considered predecessors to bound compile time for large functions. ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [CodeGen] Limit number of analyzed predecessors (PR #142584)
aengelke wrote: NB: I don't claim to fully understand what this code does, but it seems to be safe to return a default value. [Example generator](https://github.com/tpde2/tpde/blob/f6e87d2e97f49f403c12a27e7cf513a44f0f5dbc/tpde-llvm/test/filetest/many-preds.test) to demonstrate the behavior, e.g. with 10k predecessors. https://github.com/llvm/llvm-project/pull/142584 ___ llvm-branch-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
