[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

Jonas Hahnfeld via Phabricator via cfe-commits Wed, 08 Aug 2018 06:39:26 -0700

Hahnfeld added a comment.

In https://reviews.llvm.org/D47849#1192134, @gtbercea wrote:

> This patch is concerned with calling device functions when you're on the 
> device. The correctness issues you mention are orthogonal to this and should 
> be handled by another patch. I don't think this patch should be held up any 
> longer.

I'm confused by now. Could you please point out what I'm missing?

IIRC you started working on this to fix the problem with inline assembly (see 
https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes the 
declarations of math functions, but you still cannot include `math.h`, which 
most "correct" codes do.

In https://reviews.llvm.org/D47849#1170670, @tra wrote:

> The rumors of "high performance" functions in the libdevice are somewhat 
> exaggerated, IMO. If you take a look at the IR in the libdevice of recent 
> CUDA version, you will see that a lot of the functions just call their llvm 
> counterpart. If it turns out that in some case llvm generates slower code 
> than what nvidia provides, I'm sure it will be possible to implement a 
> reasonably fast replacement.

So regarding performance, it's not yet clear to me which cases actually 
benefit: Is there a particular function that is slower when LLVM's backend 
resolves the call than when the wrapper script calls libdevice directly?
If I understand @tra's comment correctly, we should have clear evidence 
(i.e. a small "benchmark") that this patch actually improves performance.
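
To make the request concrete, a kernel for such a benchmark could look 
roughly like the sketch below. It is only an illustration: the function name 
`sum_of_squares_via_pow` and the choice of `pow` are my assumptions and are 
not taken from the patch. It includes `math.h` and calls a math function 
inside a target region; when compiled without offloading support the pragma 
is ignored and the loop simply runs on the host.

```cpp
#include <math.h>
#include <stddef.h>

// Hypothetical benchmark kernel (not part of the patch): sums pow(x[i], 2.0)
// over n elements inside an OpenMP target region. Without -fopenmp the pragma
// is ignored and the loop executes sequentially on the host.
double sum_of_squares_via_pow(const double *x, size_t n) {
  double sum = 0.0;
#pragma omp target teams distribute parallel for \
    reduction(+ : sum) map(to : x[0 : n]) map(tofrom : sum)
  for (size_t i = 0; i < n; ++i)
    sum += pow(x[i], 2.0);
  return sum;
}
```

Building this once with and once without the patch (for example with 
`-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda`), timing the call for a large 
`n`, and repeating the exercise for other math functions would show whether 
any of them actually gets faster.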

Repository:
  rC Clang

https://reviews.llvm.org/D47849

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
