Issue 76448
Summary If `vmaxss` already handles NaN like `fmax`, why is `fmax` so complicated?
Labels new issue
Assignees
Reporter Eisenwave
This is a possible missed optimization in clang++. It's not a libc++ issue because `fmax` is just a thin wrapper around `__builtin_fmax`:
https://github.com/llvm/llvm-project/blob/1150e8ef7765f43a730575bd224eda18e916ac1e/libcxx/include/__math/min_max.h#L28-L30

## Code to reproduce

```cpp
#include <cmath>

float fmax_(float x, float y) {
    return std::fmax(x, y);
}
```

## Possibly suboptimal output

With `clang++ -O3 -stdlib=libc++`, this compiles to the following (https://godbolt.org/z/fn4v99sv8):

```asm
fmax_(float, float): # @fmax_(float, float)
        vmaxss  xmm2, xmm1, xmm0
        vcmpunordss     xmm0, xmm0, xmm0
        vblendvps xmm0, xmm2, xmm1, xmm0
        ret
```

Is this a missed optimization? The last two instructions are dedicated solely to NaN handling: `vcmpunordss` computes `isnan(x)`, and `vblendvps` selects `y` if `x` is NaN and the `vmaxss` result otherwise. However, I don't believe this is necessary, given the documented behavior of `maxss`:

```c
MAX(SRC1, SRC2)
{
    IF ((SRC1 = 0.0) and (SRC2 = 0.0)) THEN DEST := SRC2;
        ELSE IF (SRC1 = NaN) THEN DEST := SRC2; FI;
        ELSE IF (SRC2 = NaN) THEN DEST := SRC2; FI;
        ELSE IF (SRC1 > SRC2) THEN DEST := SRC1;
        ELSE DEST := SRC2;
    FI;
}
```

\- https://www.felixcloutier.com/x86/maxss

To me it seems like the NaN-handling behavior of `vcmpunordss` and `vblendvps` is already covered by the first two `ELSE IF` branches. Am I missing something obvious, or is this a missed optimization?
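
One way to check this is to transcribe the pseudocode above into C++ and compare it with the full three-instruction sequence for NaN inputs. This is only a minimal sketch: the names `maxss_model` and `clang_fmax_model` are hypothetical, and the operand mapping (SRC1 = `y` in `xmm1`, SRC2 = `x` in `xmm0`) is my reading of the listing, assuming `x` arrives in `xmm0` and `y` in `xmm1`.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of the documented MAX(SRC1, SRC2) pseudocode.
// Note that both NaN branches return SRC2.
float maxss_model(float src1, float src2) {
    if (src1 == 0.0f && src2 == 0.0f) return src2;
    if (std::isnan(src1)) return src2;
    if (std::isnan(src2)) return src2;
    if (src1 > src2) return src1;
    return src2;
}

// Hypothetical model of the emitted sequence, assuming x is in xmm0
// and y is in xmm1 on entry.
float clang_fmax_model(float x, float y) {
    float m = maxss_model(y, x);   // vmaxss xmm2, xmm1, xmm0
    return std::isnan(x) ? y : m;  // vcmpunordss + vblendvps
}
```

Under these assumptions, `maxss_model(y, x)` returns `x` whenever either operand is NaN. The case where only `x` is NaN is therefore the one in which the bare `vmaxss` result and `std::fmax(x, y)` would differ.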