Issue 83310
Summary clang++ miscompiles vectorizable comparisons by effectively eliding checks for NaN when FP exceptions are enabled
Labels
Assignees
Reporter nmusolino
I have encountered a potential bug in which `clang++-15` miscompiles C++ code that is written to _avoid_ invalid floating-point operations (as defined by the IEEE 754 standard), such as `x < y` where either `x` or `y` is NaN:
```cpp
    if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
      result[i] = std::numeric_limits<int32_t>::min();
    } else {
      result[i] = lhs[i] < rhs[i];  // Would raise FE_INVALID if either operand is NaN
    }
```
This miscompilation is observable when the program [enables the invalid exception](https://www.gnu.org/software/libc/manual/html_node/Control-Functions.html) using `feenableexcept(FE_INVALID);`.
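As a complement to trapping, the sticky status flag can be polled instead. The sketch below is not part of the original reproduction: the function name `GuardedLessRaisesInvalid` is hypothetical, and it takes `std::vector` rather than `std::span` only so that it does not need C++20. It runs the same guarded loop and reports whether `FE_INVALID` ended up set. Whether the flag is actually set depends on the compiler, optimization level, and target, and reading the flag is only guaranteed to be reliable under `#pragma STDC FENV_ACCESS ON` (or an equivalent strict floating-point mode), so the return value should be treated as a diagnostic hint rather than proof.

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper (not from the original report): runs the guarded
// comparison and returns true if FE_INVALID is set afterwards.
// The NaN check is meant to keep `<` from ever seeing a NaN operand, so
// a conforming build should leave the flag clear for NaN-only "bad" inputs.
bool GuardedLessRaisesInvalid(std::vector<int>& result,
                              const std::vector<double>& lhs,
                              const std::vector<double>& rhs) {
  assert(result.size() == lhs.size() && rhs.size() == lhs.size());

  std::feclearexcept(FE_INVALID);  // Start from a clean flag.
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
      result[i] = std::numeric_limits<std::int32_t>::min();
    } else {
      result[i] = lhs[i] < rhs[i];
    }
  }
  // Non-zero means some operation in between raised FE_INVALID.
  return std::fetestexcept(FE_INVALID) != 0;
}
```

The values written to `result` are the same either way; only the exception flag (or, with traps enabled, the `SIGFPE`) distinguishes the affected builds.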

Concretely, `clang++` emits instructions such as `vcmpltpd` that raise the invalid floating-point exception when a quiet NaN (QNaN) operand is encountered; these cause the program to receive `SIGFPE` when traps for `FE_INVALID` are enabled.

From the `dis` command in `lldb-15`:
```
->  0x555555555620 <+1008>: vcmpltpd %ymm4, %ymm3, %ymm3
    0x555555555625 <+1013>: vextractf128 $0x1, %ymm3, %xmm4
    0x55555555562b <+1019>: vshufps $0x88, %xmm4, %xmm3, %xmm3 ; xmm3 = xmm3[0,2],xmm4[0,2] 
    0x555555555630 <+1024>: vandps %xmm1, %xmm3, %xmm1
    0x555555555634 <+1028>: vextractf128 $0x1, %ymm0, %xmm3
    0x55555555563a <+1034>: vpackssdw %xmm3, %xmm0, %xmm0
    0x55555555563e <+1038>: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
    0x555555555644 <+1044>: vmovups %xmm0, 0x70(%rax)
```

The compiler ought to emit alternative instructions (such as `vcmplt_oqpd` or `vucomisd`?) that do not raise floating-point exceptions on QNaN inputs.

For background, GCC had similar bug reports in [#101634](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101634) and [#100778](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100778), although the examples in those reports appear to relate to divide-by-zero exceptions.

In the example below, the bug occurs with `-O2` and `-O3`, but not `-O1`.  The `-march=ivybridge` option is also required to reproduce the bug on my system.

I am not able to test with `clang++` versions 16 or 17 on my system, but the example below also triggers `SIGFPE` on the Compiler Explorer tool at godbolt.org with `clang 16.0.0` and `clang 17.0.1` for `x86_64`.

### Reproduction

```cpp
#include <span>
#include <vector>
#include <iostream>
#include <limits>
#include <cmath>
#include <cfenv>
#include <cstdint>  // int32_t

void Less(std::span<int> result, std::span<const double> lhs, std::span<const double> rhs) {
  for (std::size_t i = 0; i < lhs.size(); ++i) {
    if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
      result[i] = std::numeric_limits<int32_t>::min();
    } else {
      result[i] = lhs[i] < rhs[i];
    }
  }
}

int main() {
  feenableexcept(FE_INVALID);  // GNU extension: trap (SIGFPE) on FE_INVALID.

  std::size_t n = 32;  // Must be 32 or larger.

  constexpr double nan = std::numeric_limits<double>::quiet_NaN();
  std::vector<double> lhs(n, 0.0);
  lhs.at(n - 1) = nan;
  std::vector<double> rhs(n, 0.0); 

  std::vector<int> result(n);

  Less(std::span<int>(result), std::span<const double>{lhs}, std::span<const double>{rhs});

  std::cout << "element " << n - 1 << ": " << result.at(n - 1) << std::endl;
}
```

Compilation command:
```
clang++-15 -O3 -march=ivybridge --std=c++20 -g ./clang15_repro.cpp
```

Output of `clang++-15 --version`:
```
Ubuntu clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```