Hahnfeld added a comment.

In https://reviews.llvm.org/D47849#1192134, @gtbercea wrote:

> This patch is concerned with calling device functions when you're on the 
> device. The correctness issues you mention are orthogonal to this and should 
> be handled by another patch. I don't think this patch should be held up any 
> longer.


I'm confused by now; could you please highlight the point that I'm missing?

IIRC you started working on this to fix the problem with inline assembly (see 
https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes the 
declarations of math functions, but you still cannot include `math.h`, which 
most "correct" codes do.

In https://reviews.llvm.org/D47849#1170670, @tra wrote:

> The rumors of "high performance" functions in the libdevice are somewhat 
> exaggerated, IMO. If you take a look at the IR in the libdevice of recent 
> CUDA version, you will see that a lot of the functions just call their llvm 
> counterpart. If it turns out that in some case llvm generates slower code 
> than what nvidia provides, I'm sure it will be possible to implement a 
> reasonably fast replacement.


So regarding performance, it's not yet clear to me which cases actually benefit: 
is there a particular function that is slower when LLVM's backend resolves the 
call than when the wrapper script calls libdevice directly?
If I understand @tra's comment correctly, I think we should have clear evidence 
(i.e. a small "benchmark") that this patch actually improves performance.


Repository:
  rC Clang

https://reviews.llvm.org/D47849



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
