swjng opened a new pull request, #19568:
URL: https://github.com/apache/tvm/pull/19568

   ## Summary
   
   Five LLVM legalize rules in `src/target/llvm/intrin_rule_llvm.cc` use inline 
mathematical identities that fail on representable inputs because the 
intermediate computation overflows or cancels, even though the true result is 
in `float32` range:
   
   | Op | Inline form | Failure | True result |
   |---|---|---|---|
   | `sinh`/`cosh` (#19559) | `(exp(x) ± exp(-x)) / 2` | `exp(89) > FLT_MAX`, intermediate is `inf` | `sinh(89) ≈ 2.24e38` |
   | `atan` (#19560) | `asin(x / sqrt(x²+1))` | `x²` overflows for `|x| > 1.84e19`, then `x/inf=0`, `asin(0)=0` | `±π/2` |
   | `asinh` (#19561) | `log(x + sqrt(x²+1))` | same `x²` overflow → `log(inf)=inf` | `asinh(3e22) ≈ 52.45` |
   | `erf` (#19562) | A&S `1 − poly(t)·exp(−x²)` | `poly·exp(−x²) ≈ 1` for tiny `|x|`; subtraction cancels to 0 | `erf(3e-12) ≈ 3.4e-12` |
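
The failure modes in the table can be reproduced outside TVM. The sketch below is illustrative only: it emulates `float32` rounding in plain Python with the standard `struct` module, and the names `f32`, `naive_sinh`, `naive_atan`, `naive_asinh`, and `one_minus_near_one` are made up for this example, not TVM code. The last helper does not evaluate the A&S polynomial; it only shows that a `1 − y` form cannot return a value as small as `erf(3e-12)` once `y` rounds to 1.

```python
import math
import struct


def f32(x: float) -> float:
    """Round x to the nearest float32; out-of-range values become +/-inf."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def naive_sinh(x: float) -> float:
    # (exp(x) - exp(-x)) / 2, rounding every intermediate to float32
    ex = f32(math.exp(x))        # exp(89) > FLT_MAX, so this is inf
    enx = f32(math.exp(-x))
    return f32(f32(ex - enx) / 2.0)


def naive_atan(x: float) -> float:
    # asin(x / sqrt(x^2 + 1))
    x = f32(x)
    x2 = f32(x * x)              # inf once |x| exceeds about 1.84e19
    root = f32(math.sqrt(f32(x2 + 1.0)))
    return f32(math.asin(f32(x / root)))


def naive_asinh(x: float) -> float:
    # log(x + sqrt(x^2 + 1))
    x = f32(x)
    root = f32(math.sqrt(f32(f32(x * x) + 1.0)))
    return f32(math.log(f32(x + root)))


def one_minus_near_one(small: float) -> float:
    # Any form "1 - y" with y = 1 - small loses `small` entirely once
    # y rounds to exactly 1.0 in float32 (small below roughly 3e-8).
    y = f32(1.0 - small)
    return f32(1.0 - y)


print(naive_sinh(89.0), naive_atan(3e22), naive_asinh(3e22),
      one_minus_near_one(3.385e-12))
# prints: inf 0.0 inf 0.0
```

In each case the input and the true result are representable in `float32`; only the intermediate value is not.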
   
   ## Fix
   
   Route all five through the existing `DispatchPureExtern<FloatSuffix>` helper (i.e. `sinhf`, `coshf`, `atanf`, `asinhf`, `erff`), the same pattern `asin`/`acos` use after #19567. This gives ULP-level accuracy across the reported ranges:
   
   ```
   sinh(89.0):    ORT=2.244806e+38  TVM=2.244806e+38  (was inf)
   atan(3e22):    ORT=1.5707964     TVM=1.5707963     (was 0.0)
   asinh(3e22):   ORT=52.44863      TVM=52.44863      (was inf)
   erf(3e-12):    ORT=3.385e-12     TVM=3.385e-12     (was 0.0)
   ```
   
   `Atan` is re-enabled in `test_unary`; the overflow that previously broke it 
is fixed.
   
   ## Notes for reviewers
   
   **Inline-vs-extern decision.** If the inline identities were a deliberate 
fast-path (e.g. for autovectorization or to avoid extern-call overhead in tight 
loops), please flag it and I'll switch to stable inline forms instead — `exp(x 
− ln 2) ± exp(−x − ln 2)` for sinh/cosh, range-reduced asinh 
`sign(x)·log(2|x|)` for large `|x|`, small-`|x|` Taylor branch for erf, etc. I 
could not find evidence of such intent in the git history (sinh/cosh: original 
commit; atan/asinh: #17945 follow-up; erf: #18104 was framed as "more precise 
than tanh-approx", not "fast inline").
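
For reference, the stable inline alternatives mentioned above could look roughly like the following. This is a sketch under assumptions, not a proposed implementation: it again emulates `float32` in plain Python, the function names are invented for illustration, and the cutoff `ASINH_LARGE` is a hypothetical value chosen only so that `x*x` stays finite below it. The erf helper keeps just the first Taylor term and leaves the choice of cutoff open.

```python
import math
import struct

LN2 = math.log(2.0)
ASINH_LARGE = 2.0 ** 32   # hypothetical cutoff; x*x stays finite below it


def f32(x: float) -> float:
    """Round x to the nearest float32; out-of-range values become +/-inf."""
    try:
        return struct.unpack("f", struct.pack("f", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def stable_sinh(x: float) -> float:
    # exp(x - ln 2) - exp(-x - ln 2) equals (exp(x) - exp(-x)) / 2,
    # but no intermediate is larger than the result itself.
    return f32(f32(math.exp(f32(x - LN2))) - f32(math.exp(f32(-x - LN2))))


def stable_cosh(x: float) -> float:
    return f32(f32(math.exp(f32(x - LN2))) + f32(math.exp(f32(-x - LN2))))


def stable_asinh(x: float) -> float:
    x = f32(x)
    ax = abs(x)
    if ax >= ASINH_LARGE:
        # sqrt(x^2 + 1) rounds to |x| here, so asinh(|x|) ~ log(2|x|)
        r = f32(math.log(f32(2.0 * ax)))
    else:
        root = f32(math.sqrt(f32(f32(ax * ax) + 1.0)))
        r = f32(math.log(f32(ax + root)))
    return math.copysign(r, x)   # sign(x) * asinh(|x|)


def erf_small(x: float) -> float:
    # First Taylor term, erf(x) ~ 2/sqrt(pi) * x; meant only for tiny |x|.
    return f32(f32(2.0 / math.sqrt(math.pi)) * f32(x))


print(stable_sinh(89.0), stable_asinh(3e22), erf_small(3e-12))
```

Two caveats on the sketch: rounding `x − ln 2` to `float32` before `exp` costs a few ULP for large `|x|`, and `2|x|` itself overflows above `FLT_MAX / 2`, where `log(|x|) + ln 2` would be needed instead. Both are reasons the extern route is simpler.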
   
   **Acosh.** Same `sqrt(x²−1)` overflow pattern but no filed issue. Happy to 
include as a follow-up if maintainers want.
   
   Fixes #19559.
   Fixes #19560.
   Fixes #19561.
   Fixes #19562.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

