gtbercea added a comment.
> The downside of this approach is that LLVM doesn't recognize these function
> calls and doesn't perform optimizations to fold libcalls. For example `pow(a,
> 2)` is transformed into a multiplication but `__nv_pow(a, 2)` is not.
Doesn't CUDA have the same problem?
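
For context, here is a minimal sketch of the difference the quoted paragraph describes. `vendor_pow` is a hypothetical stand-in for a device routine such as `__nv_pow`; it is defined in terms of `std::pow` only so the snippet is self-contained, and it does not reproduce the actual libdevice implementation.

```cpp
#include <cmath>

// Call to a standard library function with a known name. LLVM's libcall
// simplification can recognize pow(a, 2) and rewrite it as a * a.
double square_libcall(double a) { return std::pow(a, 2.0); }

// Hypothetical stand-in for a vendor routine such as __nv_pow. The optimizer
// has no built-in knowledge of this name, so a call to it is not matched by
// the pow-specific folding. (Defined here only so the sketch links; assume
// the real definition lives in a separate device library.)
double vendor_pow(double a, double b) { return std::pow(a, b); }

// Same computation as square_libcall, expressed through the vendor-style name.
double square_vendor(double a) { return vendor_pow(a, 2.0); }
```

Both functions compute the same value. The only point is which call names the optimizer can pattern-match on.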
cfe-commits mailing list