Issue 124738
Summary Vectorization. Reduce with min seems weird for floats (with ffast math)
Labels new issue
Assignees
Reporter DenisYaroshevskiy
I'm looking at the following (compiled with -march=haswell -ffast-math):

```
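#include <algorithm>  // std::min
#include <numeric>    // std::reduce
#include <span>       // std::span (C++20)

float tst(std::span<const float> x) {
    return std::reduce(x.begin(), x.end(), x[0], [](float a, float b) {
        return std::min(a, b);
    });
}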
```
https://godbolt.org/z/1EjWG4x6f

The codegen does a lot of shuffles, and I'm very confused by that.

In the eve library, the main loop is just:

```
.LBB0_21:
        vminps  ymm0, ymm0, ymmword ptr [rdi]
        vminps  ymm3, ymm3, ymmword ptr [rdi + 32]
        vminps  ymm4, ymm4, ymmword ptr [rdi + 64]
        vminps  ymm2, ymm2, ymmword ptr [rdi + 96]
        vminps  ymm0, ymm0, ymmword ptr [rdi + 128]
        vminps  ymm3, ymm3, ymmword ptr [rdi + 160]
        vminps  ymm4, ymm4, ymmword ptr [rdi + 192]
        vminps  ymm2, ymm2, ymmword ptr [rdi + 224]
        vminps  ymm0, ymm0, ymmword ptr [rdi + 256]
        vminps  ymm3, ymm3, ymmword ptr [rdi + 288]
        vminps  ymm4, ymm4, ymmword ptr [rdi + 320]
        vminps  ymm2, ymm2, ymmword ptr [rdi + 352]
        vminps  ymm0, ymm0, ymmword ptr [rdi + 384]
        vminps  ymm3, ymm3, ymmword ptr [rdi + 416]
        vminps  ymm4, ymm4, ymmword ptr [rdi + 448]
        vminps  ymm2, ymm2, ymmword ptr [rdi + 480]
        add     rdi, 512
        add     r8, -4
        jne     .LBB0_21
```
https://godbolt.org/z/7hqYTcxEo
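
For readers who want the shape of that loop in source form, here is a rough scalar sketch of the same idea: four independent accumulators of eight floats each, with no cross-lane work until the loop is done. This is not eve's implementation. The name `min_reduce_unrolled`, the constants, and the pointer-plus-length signature are all made up for illustration, and whether a given compiler turns the inner loops into `vminps` is not something the code guarantees.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper for illustration only (not from eve or libc++).
// Precondition: n > 0, mirroring the use of x[0] as the initial value above.
float min_reduce_unrolled(const float* data, std::size_t n) {
    assert(n > 0);

    constexpr std::size_t kLanes = 8;  // floats in one 256-bit register
    constexpr std::size_t kAccs = 4;   // independent accumulators
    constexpr std::size_t kStep = kLanes * kAccs;

    // Seed every lane with data[0]; min is idempotent, so the seed is harmless.
    std::array<std::array<float, kLanes>, kAccs> acc;
    for (auto& a : acc) a.fill(data[0]);

    // Main loop: each accumulator lane only ever meets its own input lane,
    // so nothing here requires a shuffle.
    std::size_t i = 0;
    for (; i + kStep <= n; i += kStep) {
        for (std::size_t a = 0; a < kAccs; ++a) {
            for (std::size_t l = 0; l < kLanes; ++l) {
                acc[a][l] = std::min(acc[a][l], data[i + a * kLanes + l]);
            }
        }
    }

    // Horizontal step, done once: fold all accumulator lanes together.
    float result = data[0];
    for (const auto& a : acc) {
        for (float v : a) result = std::min(result, v);
    }

    // Scalar tail for the elements that did not fill a whole step.
    for (; i < n; ++i) result = std::min(result, data[i]);
    return result;
}
```

With a `std::span<const float> x` this would be called as `min_reduce_unrolled(x.data(), x.size())`. Reordering the comparisons like this changes the result only for inputs such as NaN or signed zeros, which is the kind of difference -ffast-math lets the compiler ignore.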

Is there a reason that's not what std::reduce does with -ffast-math? At first glance, I'd expect it to be OK.