wuyii8941 opened a new issue, #19572:
URL: https://github.com/apache/tvm/issues/19572
## Summary
Multiple operators handle NaN differently from ONNX Runtime when accessed
through the ONNX frontend:
1. **Relu(NaN) → 0** (ORT: NaN)
2. **Sign(NaN) → 0** (ORT: NaN)
3. **ReduceMax/ReduceMin** — position-dependent NaN behavior:
   - `ReduceMax([NaN, 1.0]) → 1.0` (ORT: NaN)
   - `ReduceMax([2.0, NaN]) → NaN` (ORT: 2.0)
Related: #xxx (bug_019, reduce_max/min NaN CPU vs CUDA at Relax IR level)
## Reproduction
```python
import numpy as np
import onnx
from onnx import helper, TensorProto, numpy_helper
import onnxruntime as ort
import tvm
from tvm import relax
from tvm.relax.frontend.onnx import from_onnx

def run_tvm(model, inputs):
    model = onnx.shape_inference.infer_shapes(model)
    mod = from_onnx(model)
    pipeline = tvm.ir.transform.Sequential([relax.transform.LegalizeOps()])
    exe = tvm.relax.build(pipeline(mod), target="llvm")
    vm = tvm.relax.VirtualMachine(exe, device=tvm.cpu())
    tvm_ins = [tvm.runtime.tensor(v, device=tvm.cpu()) for v in inputs]
    return vm["main"](*tvm_ins).numpy()

# Relu
x = np.array([np.nan, 1.0, np.nan, -2.0], dtype=np.float32)
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [4])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [4])
node = helper.make_node("Relu", ["X"], ["Y"])
graph = helper.make_graph([node], "test", [X], [Y])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])
sess = ort.InferenceSession(model.SerializeToString())
print("ORT Relu:", sess.run(None, {"X": x})[0]) # [nan 1. nan 0.]
print("TVM Relu:", run_tvm(model, [x])) # [0. 1. 0. 0.]
# ReduceMax
x2 = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)
X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [2, 2])
Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, None)
axes_init = numpy_helper.from_array(np.array([1], dtype=np.int64), "axes")
node = helper.make_node("ReduceMax", ["X", "axes"], ["Y"], keepdims=0)
graph = helper.make_graph([node], "test", [X], [Y], initializer=[axes_init])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])
sess = ort.InferenceSession(model.SerializeToString())
print("ORT ReduceMax:", sess.run(None, {"X": x2})[0]) # [nan 2.]
print("TVM ReduceMax:", run_tvm(model, [x2])) # [ 1. nan]
```
## Root cause
- **Relu**: Lowered to `max(x, 0)` using `fmax` semantics — NaN treated as
missing value
- **Sign**: Comparison chain (`x > 0 → 1, x < 0 → -1, else 0`) — NaN falls
to default 0
- **ReduceMax/Min**: Left-fold with `fmax`/`fmin` — NaN propagation depends
on position in fold order
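
The pure-Python sketch below models the three patterns described above and reproduces the TVM outputs from the reproduction. It is an illustration under stated assumptions, not TVM's actual lowering: `select_max`, `fold_max`, `relu_select`, `sign_chain`, and `nan_propagating_max` are hypothetical names. Note that a strict C `fmax` returns the non-NaN operand regardless of operand order, so the position dependence is modeled here with a comparison-based select (`a if a > b else b`), which is an assumption about how the observed results could arise.

```python
import math


def select_max(a, b):
    # Hypothetical comparison-based max. Any comparison involving NaN is
    # False, so the right operand is returned whenever either side is NaN.
    return a if a > b else b


def fold_max(values):
    # Left fold starting from -inf, as a reduction might be lowered.
    acc = -math.inf
    for v in values:
        acc = select_max(acc, v)
    return acc


def relu_select(x):
    # Relu modeled as max(x, 0) with the select above: NaN maps to 0.0.
    return select_max(x, 0.0)


def sign_chain(x):
    # Comparison chain: NaN fails both tests and falls through to 0.0.
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0


def nan_propagating_max(values):
    # Position-independent alternative: any NaN in the input yields NaN.
    if any(math.isnan(v) for v in values):
        return math.nan
    return max(values)


print(fold_max([math.nan, 1.0]))             # NaN is overwritten by 1.0
print(fold_max([2.0, math.nan]))             # trailing NaN survives the fold
print(relu_select(math.nan), sign_chain(math.nan))
print(nan_propagating_max([2.0, math.nan]))  # NaN regardless of position
```

In this model the result of the fold depends only on whether the NaN is the last element compared, which matches the `[1. nan]` output reported for TVM; an explicit `isnan` check removes the dependence on position.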
## Note
We acknowledge that the ONNX spec does not normatively require NaN
propagation for these operators. However, ONNX Runtime (widely used as the
de facto reference implementation) propagates NaN for Relu and Sign, and its
ReduceMax/ReduceMin results differ from TVM's for the same inputs. TVM's
behavior therefore causes silent numerical divergence when migrating models
between runtimes. We report this as a behavioral inconsistency.
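
For comparison, NumPy offers both NaN-propagating and NaN-ignoring primitives, which makes the two possible semantics easy to see on the inputs from the reproduction. This is only a sketch of what position-independent behavior looks like; it is not a claim about what either TVM or ONNX Runtime implements, and the variable names are illustrative.

```python
import numpy as np

x = np.array([np.nan, 1.0, np.nan, -2.0], dtype=np.float32)
x2 = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)

# np.maximum and np.sign propagate NaN element-wise.
relu_propagate = np.maximum(x, 0)
sign_propagate = np.sign(x)

# np.max propagates NaN: any NaN in a row makes that row's result NaN.
max_propagate = np.max(x2, axis=1)

# np.fmax ignores NaN when the other operand is a number, in either order.
max_ignore = np.fmax.reduce(x2, axis=1)

print("relu (propagate):", relu_propagate)
print("sign (propagate):", sign_propagate)
print("max  (propagate):", max_propagate)
print("max  (ignore NaN):", max_ignore)
```

Both reductions give the same answer for `[NaN, 1.0]` and `[1.0, NaN]`, so either would remove the position dependence shown in the reproduction; which one to adopt is a design decision for the frontend.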
## Environment
- TVM: 0.24.dev0, commit 0b0afd8dd (2026-04-24)
- Python: 3.11
- OS: Linux