| Field | Value |
| --- | --- |
| Issue | 124738 |
| Summary | Vectorization. Reduce with min seems weird for floats (with ffast math) |
| Labels | new issue |
| Assignees | |
| Reporter | DenisYaroshevskiy |

I'm looking at the following, compiled with `-march=haswell -ffast-math`:
```
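#include <algorithm>  // std::min
#include <numeric>    // std::reduce
#include <span>       // std::span (C++20)

// Minimum of a non-empty span; x[0] seeds the reduction.
float tst(std::span<const float> x) {
  return std::reduce(x.begin(), x.end(), x[0],
                     [](float a, float b) { return std::min(a, b); });
}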
```
https://godbolt.org/z/1EjWG4x6f
The codegen does a lot of shuffles, which confuses me.
With the EVE library, the main loop is just:
```
.LBB0_21:
vminps ymm0, ymm0, ymmword ptr [rdi]
vminps ymm3, ymm3, ymmword ptr [rdi + 32]
vminps ymm4, ymm4, ymmword ptr [rdi + 64]
vminps ymm2, ymm2, ymmword ptr [rdi + 96]
vminps ymm0, ymm0, ymmword ptr [rdi + 128]
vminps ymm3, ymm3, ymmword ptr [rdi + 160]
vminps ymm4, ymm4, ymmword ptr [rdi + 192]
vminps ymm2, ymm2, ymmword ptr [rdi + 224]
vminps ymm0, ymm0, ymmword ptr [rdi + 256]
vminps ymm3, ymm3, ymmword ptr [rdi + 288]
vminps ymm4, ymm4, ymmword ptr [rdi + 320]
vminps ymm2, ymm2, ymmword ptr [rdi + 352]
vminps ymm0, ymm0, ymmword ptr [rdi + 384]
vminps ymm3, ymm3, ymmword ptr [rdi + 416]
vminps ymm4, ymm4, ymmword ptr [rdi + 448]
vminps ymm2, ymm2, ymmword ptr [rdi + 480]
add rdi, 512
add r8, -4
jne .LBB0_21
```
https://godbolt.org/z/7hqYTcxEo
Is there a reason that's not what `std::reduce` generates with `-ffast-math`? At first glance I'd expect it to be OK.