wuyii8941 opened a new issue, #19572:
URL: https://github.com/apache/tvm/issues/19572

   
   ## Summary
   
   Several operators handle NaN differently from ONNX Runtime (ORT) when 
imported through the ONNX frontend. In the list below, the TVM result is shown 
first and the ORT result in parentheses:
   
   1. **Relu(NaN) → 0** (ORT: NaN)
   2. **Sign(NaN) → 0** (ORT: NaN)
   3. **ReduceMax/ReduceMin** — position-dependent NaN behavior:
      - `ReduceMax([NaN, 1.0]) → 1.0` (ORT: NaN)
      - `ReduceMax([2.0, NaN]) → NaN` (ORT: 2.0)
   
   Related: #xxx (bug_019, reduce_max/min NaN CPU vs CUDA at Relax IR level)
   
   ## Reproduction
   
   ```python
   import numpy as np
   import onnx
   from onnx import helper, TensorProto, numpy_helper
   import onnxruntime as ort
   import tvm
   from tvm import relax
   from tvm.relax.frontend.onnx import from_onnx
   
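   def run_tvm(model, inputs):
       # Run ONNX shape inference before importing the model.
       model = onnx.shape_inference.infer_shapes(model)
       # Import the ONNX graph as a Relax IRModule.
       mod = from_onnx(model)
       # Legalize Relax ops, then build for the host CPU (LLVM target).
       pipeline = tvm.ir.transform.Sequential([relax.transform.LegalizeOps()])
       exe = tvm.relax.build(pipeline(mod), target="llvm")
       # Execute "main" on the Relax VM and return the result as a NumPy array.
       vm = tvm.relax.VirtualMachine(exe, device=tvm.cpu())
       tvm_ins = [tvm.runtime.tensor(v, device=tvm.cpu()) for v in inputs]
       return vm["main"](*tvm_ins).numpy()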
   
   # Relu
   x = np.array([np.nan, 1.0, np.nan, -2.0], dtype=np.float32)
   X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [4])
   Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [4])
   node = helper.make_node("Relu", ["X"], ["Y"])
   graph = helper.make_graph([node], "test", [X], [Y])
   model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])
   
   sess = ort.InferenceSession(model.SerializeToString())
   print("ORT Relu:", sess.run(None, {"X": x})[0])   # [nan  1. nan  0.]
   print("TVM Relu:", run_tvm(model, [x]))             # [0.   1. 0.   0.]
   
   # ReduceMax
   x2 = np.array([[np.nan, 1.0], [2.0, np.nan]], dtype=np.float32)
   X = helper.make_tensor_value_info("X", TensorProto.FLOAT, [2, 2])
   Y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, None)
   axes_init = numpy_helper.from_array(np.array([1], dtype=np.int64), "axes")
   node = helper.make_node("ReduceMax", ["X", "axes"], ["Y"], keepdims=0)
   graph = helper.make_graph([node], "test", [X], [Y], initializer=[axes_init])
   model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 18)])
   
   sess = ort.InferenceSession(model.SerializeToString())
   print("ORT ReduceMax:", sess.run(None, {"X": x2})[0])  # [nan  2.]
   print("TVM ReduceMax:", run_tvm(model, [x2]))           # [ 1. nan]
   ```
   
   ## Root cause
   
   - **Relu**: lowered to `max(x, 0)` with `fmax` semantics, so NaN is treated 
as a missing value and the result is 0.
   - **Sign**: lowered to a comparison chain (`x > 0 → 1, x < 0 → -1, else 0`); 
every comparison with NaN is false, so NaN falls through to the default 0.
   - **ReduceMax/ReduceMin**: lowered to a left fold with `fmax`/`fmin`, so 
whether NaN propagates depends on its position in the fold order.
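
The behaviors listed above can be modeled in plain Python. This is only a sketch, not TVM's actual lowering: all function names are hypothetical, and `-inf` as the fold's initial value is an assumption. It also shows that a strict IEEE-style `fmax` fold would return the non-NaN operand for both rows, whereas a comparison-based select reproduces the position-dependent results reported in the reproduction. Which of the two TVM actually emits is not confirmed here.

```python
import math

def select_max(a, b):
    # Comparison-based max: any comparison with NaN is False, so b is returned.
    return a if a > b else b

def fmax_like(a, b):
    # IEEE-style fmax: a NaN operand is treated as missing.
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b

def relu_select(x):
    # max(x, 0) written as a select; NaN > 0 is False, so NaN maps to 0.
    return x if x > 0.0 else 0.0

def sign_chain(x):
    # Comparison chain; NaN fails both tests and falls through to 0.
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return -1.0
    return 0.0

def fold_max(values, combine):
    # Left fold starting from -inf (assumed initial value).
    acc = -math.inf
    for v in values:
        acc = combine(acc, v)
    return acc

nan = math.nan
print(relu_select(nan))                   # 0.0
print(sign_chain(nan))                    # 0.0
print(fold_max([nan, 1.0], select_max))   # 1.0
print(fold_max([2.0, nan], select_max))   # nan
print(fold_max([nan, 1.0], fmax_like))    # 1.0
print(fold_max([2.0, nan], fmax_like))    # 2.0
```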
   
   ## Note
   
   We acknowledge that the ONNX spec does not normatively require NaN 
propagation for these operators. However, ONNX Runtime (the reference 
implementation) propagates NaN consistently, and TVM's behavior causes silent 
numerical divergence when migrating models between runtimes. We are therefore 
reporting this as a behavioral inconsistency.
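
When comparing outputs across runtimes, a check along the following lines could flag this kind of divergence explicitly. It is a minimal sketch: `nan_mismatches` is a hypothetical helper, and the sample values are the Relu outputs quoted in the reproduction comments.

```python
import math

def nan_mismatches(reference, candidate):
    """Return the indices where exactly one of the two outputs is NaN."""
    return [
        i
        for i, (r, c) in enumerate(zip(reference, candidate))
        if math.isnan(r) != math.isnan(c)
    ]

# Relu outputs as reported in the reproduction above.
ort_relu = [math.nan, 1.0, math.nan, 0.0]
tvm_relu = [0.0, 1.0, 0.0, 0.0]
print(nan_mismatches(ort_relu, tvm_relu))  # [0, 2]
```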
   
   ## Environment
   
   - TVM: 0.24.dev0, commit 0b0afd8dd (2026-04-24)
   - Python: 3.11
   - OS: Linux
   



