frasercrmck wrote:

> Is there is way to test it based on llvm code base?

Not that I'm aware of.

We can probably go by the documented maximum ULP error both for 
[CUDA](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#standard-functions)
 and 
[OpenCL](https://registry.khronos.org/OpenCL/specs/3.0-unified/html/OpenCL_C.html#relative-error-as-ulps),
 I think:

| Function  | OpenCL max ULP | NVIDIA max ULP |
|----------|-----|---------|
| sqrtf | 3 | 1 or 3 (depending on architecture) |
| sqrt  | correctly rounded | 0 (IEEE-754 round-to-nearest-even) |
| sinpif | 4  | 1 |
| sinpi | 4 | 2 |
| rsqrtf | 2  | 2 |
| rsqrt | 2 | 1 |
| logf | 3 | 1 |
| log | 3 | 1 |

So judging by this, we're most likely okay. I'm assuming `__nv_isinf` is 
correct. I wonder if it's even worth us using that implementation.

Note also that the NVIDIA documentation is explicitly for the CUDA built-in 
function, e.g., `log`, not `__nv_log`. I'm assuming here that `__nv_x`, which 
we're using, is the full implementation of `x`.

https://github.com/llvm/llvm-project/pull/150174
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits