frasercrmck wrote:
> Is there a way to test it based on the llvm code base?
Not that I'm aware of. We can probably go by the documented ULP both for [CUDA](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#standard-functions) and [OpenCL](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#relative-error-as-ulps), I think:

| Function | OpenCL max ULP | NVIDIA max ULP |
|----------|----------------|----------------|
| sqrtf | 3 | 1 or 3 (depending on architecture) |
| sqrt | correctly rounded | 0 (IEEE-754 round-to-nearest-even) |
| sinpif | 4 | 1 |
| sinpi | 4 | 2 |
| rsqrtf | 2 | 2 |
| rsqrt | 2 | 1 |
| logf | 3 | 1 |
| log | 3 | 1 |

So judging by this we're most likely okay. I'm assuming `__nv_isinf` is correct; I wonder if it's even worth us using that implementation.

Note also that the NVIDIA documentation is explicitly for the CUDA built-in function, e.g. `log`, not `__nv_log`. I'm assuming here that `__nv_x`, which we're using, is the full implementation of `x`.

https://github.com/llvm/llvm-project/pull/150174

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits