vacu9708 opened a new pull request, #17985:
URL: https://github.com/apache/tvm/pull/17985

   # Summary
   This PR resolves issue [#17965](https://github.com/apache/tvm/issues/17965) 
where the same model produces different outputs on the LLVM (CPU) and CUDA 
(GPU) backends.
   
   # Update
   Two root causes were identified and addressed:
   - The Taylor-series approximation of `asin` did not check its input domain, 
allowing values outside [-1, 1] to silently produce a result instead of NaN.
       - **Fix:** Update `tir.asin` to return NaN if the input is outside 
[-1, 1].
   - Max/min pooling ops on LLVM followed a "propagate-NaN" policy, whereas 
CUDA’s fmax/fmin treat NaN as missing data.
       - **Fix:** Update the LLVM pooling ops to follow CUDA’s fmax/fmin 
rules:
           - If one operand is NaN and the other is a number, choose the 
numeric value.
           - If both are NaN, return NaN.
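
   The two behaviors above can be sketched in plain Python. This is only an 
illustration, not the code in this PR: the function names are hypothetical, 
and the four-term series is an assumption standing in for whatever 
approximation `tir.asin` actually uses.

```python
import math
from functools import reduce


def taylor_asin(x: float) -> float:
    """Truncated Taylor series for asin around 0 (no domain check)."""
    # asin(x) = x + x^3/6 + 3x^5/40 + 5x^7/112 + ...
    return x + x**3 / 6 + 3 * x**5 / 40 + 5 * x**7 / 112


def checked_asin(x: float) -> float:
    """Same series, but NaN for inputs outside [-1, 1]."""
    # The comparison is False for NaN as well, so NaN inputs stay NaN.
    if not (-1.0 <= x <= 1.0):
        return math.nan
    return taylor_asin(x)


def fmax_like(a: float, b: float) -> float:
    """Max that treats NaN as missing data: prefer the numeric operand."""
    if math.isnan(a):
        return b  # b may itself be NaN, which covers the both-NaN case
    if math.isnan(b):
        return a
    return max(a, b)


def fmin_like(a: float, b: float) -> float:
    """Min counterpart of fmax_like."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def max_pool_window(window):
    """Reduce one pooling window with the NaN-as-missing rule."""
    return reduce(fmax_like, window)
```

   Without the domain check, `taylor_asin(2.0)` evaluates to an ordinary 
finite number, which is how an out-of-range input could go unnoticed. With 
the fmax-style rule, `max_pool_window([math.nan, 1.0, 3.0])` yields `3.0`, 
while a window containing only NaN still yields NaN.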
   
   # Note
   - I also tried to align the CUDA behavior with LLVM, but CUDA doesn't seem 
to support the "propagate-NaN" policy.

