wuyii8941 opened a new issue, #19579:
URL: https://github.com/apache/tvm/issues/19579
## Expected behavior
`maximum(NaN, x)` should return `NaN` per IEEE 754-2019 §9.6, consistent
with NumPy, PyTorch, JAX, and ONNX Runtime.
`relu(NaN)` should return `NaN` (since relu = max(x, 0)).
## Actual behavior
When NaN is the **first** operand of `T.max` / `T.min`, the result is the
second operand instead of NaN. This affects `R.maximum`, `R.minimum`,
`R.nn.relu`, and `R.clip`.
The root cause is that `T.max(a, b)` compiles to the x86 `maxss`/`maxps`
instructions, which return the second source operand whenever either operand
is NaN, so a NaN in **src1** is replaced by **src2**. The IEEE 754-2019
`maximum`/`minimum` operations instead return NaN when either operand is NaN.
## Reproducer
```python
import numpy as np
import tvm
from tvm import relax
import tvm.relax.op as R
from tvm.relax.transform import LegalizeOps
bb = relax.BlockBuilder()
a = relax.Var("a", relax.TensorStructInfo((4,), "float32"))
b = relax.Var("b", relax.TensorStructInfo((4,), "float32"))
with bb.function("main", [a, b]):
    with bb.dataflow():
        gv = bb.emit_output(bb.emit(R.maximum(a, b)))
    bb.emit_func_output(gv)
mod = bb.finalize()
pipeline = tvm.ir.transform.Sequential([LegalizeOps()])
exe = tvm.relax.build(pipeline(mod), target="llvm")
vm = tvm.relax.VirtualMachine(exe, device=tvm.cpu())
A = np.array([np.nan, 1.0, np.nan, 0.0], np.float32)
B = np.array([1.0, np.nan, np.nan, np.nan], np.float32)
out = vm["main"](
    tvm.runtime.tensor(A, device=tvm.cpu()),
    tvm.runtime.tensor(B, device=tvm.cpu()),
).numpy()
print(out) # [1. nan nan nan] — element 0 is WRONG
print(np.maximum(A, B)) # [nan nan nan nan] — all NaN per IEEE 754
```
The pattern is operand-order-dependent:
| Expression | TVM | Expected (IEEE 754) |
|---|---|---|
| `max(NaN, 1.0)` | `1.0` | `NaN` |
| `max(1.0, NaN)` | `NaN` | `NaN` |
| `relu(NaN)` = `max(NaN, 0)` | `0.0` | `NaN` |
| `clip(NaN, -1, 1)` | `1.0` | `NaN` |
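
The TVM column above can be reproduced with a small pure-Python model. This is only a sketch under two assumptions that are not confirmed by the TVM source here: that `T.max(a, b)` behaves like the compare-and-select `a > b ? a : b` (and `T.min` like `a < b ? a : b`), and that `clip` is computed as `max(min(x, hi), lo)`. The helper names `select_max`, `select_min`, `relu`, and `clip` are hypothetical.

```python
import math

def select_max(a, b):
    # Compare-and-select max: any comparison with NaN is False,
    # so the second operand is returned whenever either operand is NaN.
    return a if a > b else b

def select_min(a, b):
    # Same pattern for min.
    return a if a < b else b

def relu(x):
    # relu modelled as max(x, 0), with x as the first operand.
    return select_max(x, 0.0)

def clip(x, lo, hi):
    # Assumed order of operations: min against hi first, then max against lo.
    return select_max(select_min(x, hi), lo)

nan = math.nan
results = {
    "max(NaN, 1.0)": select_max(nan, 1.0),
    "max(1.0, NaN)": select_max(1.0, nan),
    "relu(NaN)": relu(nan),
    "clip(NaN, -1, 1)": clip(nan, -1.0, 1.0),
}
for expr, value in results.items():
    print(f"{expr:18} -> {value}")
```

Under this model the NaN survives only when it is the second operand, which matches the operand-order dependence in the table.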
## Affected operations
```python
R.maximum(a, b) # when a is NaN
R.minimum(a, b) # when a is NaN
R.nn.relu(x) # when x is NaN → returns 0
R.clip(x, lo, hi) # when x is NaN → returns hi
```
Not affected (correct NaN propagation):
- `R.add`, `R.multiply`, `R.subtract`, `R.divide` — arithmetic propagates
NaN correctly
- `R.nn.leakyrelu` — uses comparison path, NaN propagates through multiply
- `R.nn.silu`, `R.nn.gelu` — sigmoid/erf path propagates NaN
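
Why these operators keep the NaN can be illustrated with scalar Python floats. The functions below are textbook formulas written for illustration, not TVM's lowered code, and the `alpha=0.01` default is a hypothetical value.

```python
import math

nan = math.nan

def leaky_relu(x, alpha=0.01):
    # NaN > 0 is False, so the multiply branch is taken and NaN * alpha is NaN.
    return x if x > 0 else alpha * x

def silu(x):
    # x * sigmoid(x): every step is arithmetic on x, so NaN flows through.
    return x / (1.0 + math.exp(-x))

def gelu(x):
    # erf-based GELU: erf(NaN) is NaN, and the products stay NaN.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

arithmetic = [nan + 1.0, nan * 2.0, nan - 1.0, nan / 2.0]
activations = [leaky_relu(nan), silu(nan), gelu(nan)]

print(all(math.isnan(v) for v in arithmetic + activations))
```

In each case the NaN input reaches an arithmetic operation instead of being discarded by a comparison, which is the difference from the `max`/`min` path.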
## Why this matters
`relu` is one of the most common activation functions. When an upstream
computation produces NaN (e.g., from `inf - inf` after an overflow, or from
`0/0`), the NaN should propagate to signal the error. Instead, TVM's `relu`
silently converts NaN to 0, making the error invisible:
```python
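import numpy as np

# Suppose upstream overflow produces NaN in one element:
x = np.array([[1.0, 2.0, np.nan, 4.0]], dtype=np.float32)
print(np.maximum(x, 0).sum())  # NumPy relu: nan, correctly signals the problem
# The same relu(x).sum() through TVM's R.nn.relu: 7.0, the NaN silently disappeared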
```
This can cause silent wrong results in production models, where NaN
detection is a standard debugging/monitoring signal.
## Root cause
In the lowered TIR, `maximum` becomes `T.max(a, b)`, which LLVM lowers to
x86 `maxss`/`maxps`. These instructions return the second source operand
whenever either operand is NaN, whereas the IEEE 754-2019 `maximum`/`minimum`
operations return NaN if either operand is NaN.
The fix would be to emit NaN-aware max/min, e.g.:
```
select(isnan(a) | isnan(b), NaN, max(a, b))
```
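
The intended semantics of that `select` can be sketched on scalar Python floats. This models the proposed behavior only and is not a TVM patch; `nan_aware_max` and `nan_aware_min` are hypothetical names.

```python
import math

def nan_aware_max(a, b):
    # Mirrors select(isnan(a) | isnan(b), NaN, max(a, b)).
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a > b else b

def nan_aware_min(a, b):
    # Same guard for min.
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return a if a < b else b

# Inputs taken from the reproducer above.
A = [math.nan, 1.0, math.nan, 0.0]
B = [1.0, math.nan, math.nan, math.nan]

out = [nan_aware_max(a, b) for a, b in zip(A, B)]
print(out)
```

With the guard in place the result no longer depends on which operand holds the NaN, so every element of `out` is NaN, in agreement with `np.maximum(A, B)` in the reproducer. The extra `isnan` checks add cost per element, so whether to apply them unconditionally is a design decision for the maintainers.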
## Environment
- TVM commit: 0b0afd8dd (main, 2026-04-24)
- OS: Ubuntu 20.04
- Target: llvm (CPU, x86-64)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]