| Issue |
86635
|
| Summary |
[aarch64] Prefer use callee save register to avoid repeat load when cross call
|
| Labels |
new issue
|
| Assignees |
|
| Reporter |
vfdff
|
* test: https://gcc.godbolt.org/z/hdPf8veEc
```
float foo (float num[], float r2inv, int n) {
float sum = 0.0;
for (int i=0; i < 1000; i++) {
float a = num[i];
// const float r = a / std::sqrt (r2inv);
const float expm2 = std::exp (a);
float tmp = expm2 * num[i];
sum += tmp * tmp;
}
return sum;
}
```
* llvm: When built with O3, use 2 load to get num[i] **before** and **after** the call expf
```
.LBB0_1: // =>This Inner Loop Header: Depth=1
ldr s0, [x19, x20] // load num[i]
bl expf
ldr s1, [x19, x20] // repeat load the num[i]
add x20, x20, #4
cmp x20, #4000
fmul s0, s0, s1
fmadd s8, s0, s0, s8
b.ne .LBB0_1
```
* gcc: Only one load for num[i] because it use the callee save register
```
.L2:
ldr s15, [x19], 4
fmov s0, s15
bl expf
fmul s15, s15, s0
fmadd s14, s15, s15, s14
cmp x19, x20
bne .L2
```
_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs