| Field | Value |
| --- | --- |
| Issue | 83310 |
| Summary | clang++ miscompiles vectorizable comparisons by effectively eliding checks for NaN when FP exceptions are enabled |
| Labels | |
| Assignees | |
| Reporter | nmusolino |

I have encountered a potential bug in which `clang++-15` miscompiles C++ code that is written to _avoid_ invalid floating point operations (in the IEEE 754 sense), such as `x < y` where either `x` or `y` is NaN:
```cpp
if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
    result[i] = std::numeric_limits<int32_t>::min();
} else {
    result[i] = lhs[i] < rhs[i];  // Would raise FE_INVALID if either operand is NaN
}
```
This miscompilation is observable when the program [enables the invalid exception](https://www.gnu.org/software/libc/manual/html_node/Control-Functions.html) using `feenableexcept(FE_INVALID);`.
Concretely, `clang++` emits instructions such as `vcmpltpd` that raise the invalid floating point exception when a quiet NaN (QNaN) operand is encountered; these cause the program to receive `SIGFPE` when traps for `FE_INVALID` are enabled.
From the `dis` command in `lldb-15`:
```
->  0x555555555620 <+1008>: vcmpltpd %ymm4, %ymm3, %ymm3
    0x555555555625 <+1013>: vextractf128 $0x1, %ymm3, %xmm4
    0x55555555562b <+1019>: vshufps $0x88, %xmm4, %xmm3, %xmm3 ; xmm3 = xmm3[0,2],xmm4[0,2]
    0x555555555630 <+1024>: vandps %xmm1, %xmm3, %xmm1
    0x555555555634 <+1028>: vextractf128 $0x1, %ymm0, %xmm3
    0x55555555563a <+1034>: vpackssdw %xmm3, %xmm0, %xmm0
    0x55555555563e <+1038>: vblendvps %xmm0, %xmm2, %xmm1, %xmm0
    0x555555555644 <+1044>: vmovups %xmm0, 0x70(%rax)
```
The compiler ought to emit alternative instructions (such as `vcmplt_oqpd` or `vucomisd`?) that do not raise the invalid exception on quiet NaN inputs.
For background, GCC had similar bug reports in [#101634](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101634) and [#100778](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100778), although the examples in those reports appear to relate to divide-by-zero exceptions.
In the example below, the bug occurs with `-O2` and `-O3`, but not `-O1`. The `-march=ivybridge` option is also required to reproduce the bug on my system.
I am not able to test with `clang++` versions 16 or 17 on my system, but the example below also triggers `SIGFPE` on the Compiler Explorer tool at godbolt.org with `clang 16.0.0` and `clang 17.0.1` for `x86_64`.
### Reproduction
```cpp
#include <span>
#include <vector>
#include <iostream>
#include <limits>
#include <cmath>
#include <cfenv>

void Less(std::span<int> result, std::span<const double> lhs, std::span<const double> rhs) {
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
            result[i] = std::numeric_limits<int32_t>::min();
        } else {
            result[i] = lhs[i] < rhs[i];
        }
    }
}

int main() {
    feenableexcept(FE_INVALID);
    std::size_t n = 32;  // Must be 32 or larger.
    constexpr double nan = std::numeric_limits<double>::quiet_NaN();
    std::vector<double> lhs(n, 0.0);
    lhs.at(n - 1) = nan;
    std::vector<double> rhs(n, 0.0);
    std::vector<int> result(n);
    Less(std::span<int>(result), std::span<const double>{lhs}, std::span<const double>{rhs});
    std::cout << "element " << n - 1 << ": " << result.at(n - 1) << std::endl;
}
```
Compilation command:
```
clang++-15 -O3 -march=ivybridge --std=c++20 -g ./clang15_repro.cpp
```
Output of `clang++-15 --version`:
```
Ubuntu clang version 15.0.7
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
```
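
For triage without a debugger, a non-trapping variant of the loop may be useful. The sketch below is not part of the original reproducer: it leaves traps disabled and instead inspects the sticky `FE_INVALID` flag with `std::fetestexcept` after the loop has run. The names `LessNoTrap` and `RaisesInvalid` are hypothetical, and `std::vector` is used in place of `std::span` only so that the sketch does not need C++20. Whether the flag ends up set is expected to depend on the compiler version, optimization level and `-march` setting, as described above; the compiler is also not strictly obliged to preserve flag semantics here, so treat the flag result as a hint rather than proof.

```cpp
#include <cfenv>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Same loop body as the reproducer; std::vector replaces std::span so that
// the sketch does not require C++20. LessNoTrap is a hypothetical name.
void LessNoTrap(std::vector<int>& result, const std::vector<double>& lhs,
                const std::vector<double>& rhs) {
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        if (std::isnan(lhs[i]) || std::isnan(rhs[i])) {
            result[i] = std::numeric_limits<std::int32_t>::min();
        } else {
            result[i] = lhs[i] < rhs[i];
        }
    }
}

// Hypothetical helper: clears the sticky exception flags, runs the loop with
// traps left disabled, and reports whether FE_INVALID was raised meanwhile.
bool RaisesInvalid(std::vector<int>& result, const std::vector<double>& lhs,
                   const std::vector<double>& rhs) {
    std::feclearexcept(FE_ALL_EXCEPT);
    LessNoTrap(result, lhs, rhs);
    return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling `RaisesInvalid` with the inputs from the reproducer (32 elements, the last `lhs` element a quiet NaN) and printing the return value should indicate whether the compiled loop performed a signaling comparison on the NaN lane. The values written to `result` should be the same either way, since the blend still selects the NaN branch for that element.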