[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-05-15 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea abandoned this revision. gtbercea added a comment. Replaced by: D61399

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-04-29 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment. In D47849#1435770, @hfinkel wrote: > We need to make progress on this, and I'd like to suggest a path forward... First, we have a fundamental problem here: Using host headers to declare functions for the device execution

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-03-22 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. Thank you both for the feedback. It's good to see that there's interest in moving this forward. I will try to refactor this patch according to Hal's suggestions and see if there are any blockers. Thanks!

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-03-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > This is, or is very similar to, the problem that the host/device overloading addresses in CUDA. IIRC the difference was that OpenMP didn't have an explicit notion of host/device functions, which made it hard to apply host/device overloading in practice. > It is also the

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2019-03-19 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment. We need to make progress on this, and I'd like to suggest a path forward... First, we have a fundamental problem here: Using host headers to declare functions for the device execution environment isn't sound. Those host headers can do anything, and while some platforms

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-23 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Basic/Targets/NVPTX.cpp:232 + // getting inlined on the device. + Builder.defineMacro("__NO_MATH_INLINES"); } tra wrote: > This relies on an implementation detail of a particular variant of the header file you're a

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. `__clang_cuda_device_functions.h` is not intended to be a device-side math.h, despite having a lot of overlap/similarities. It may change any time we get a new CUDA version. I would suggest writing an OpenMP-specific replacement for math.h which would map to whatever devic

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-22 Thread Greg Rodgers via Phabricator via cfe-commits
gregrodgers added a comment. I like the idea of using an automatic include as a cc1 option (-include). However, I would prefer a more general automatic include for OpenMP, not just for math functions (__clang_cuda_device_functions.h). Clang CUDA automatically includes __clang_cuda_runtime_wrap

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 160598. gtbercea added a comment. Herald added a subscriber: jholewinski. Add __NO_MATH_INLINES macro for the NVPTX toolchain to prevent any host assembly from seeping onto the device. Repository: rC Clang https://reviews.llvm.org/D47849 Files: inclu

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. Just to address any generality concerns: This patch fixes the problem of calling libdevice math functions for all platform combinations. It ensures that the OpenMP NVPTX target region will NOT call any host math functions (whichever host that may be) IF equivalent dev

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. Thanks @Hahnfeld for your suggestions. Unfortunately, doing the lowering in the backend would require replacing the math function calls with calls to libdevice functions. I have not been able to do that in an elegant way. Encoding the interface to libdevice is ju

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-10 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > I don't want to use a fast `pow(a, 2)`, I don't want to call a library function for that at all. I do believe you won't end up calling a function. If you're compiling with optimizations on, this will be inlined.

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-10 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > The downside of this approach is that LLVM doesn't recognize these function calls and doesn't perform optimizations to fold libcalls. For example `pow(a, 2)` is transformed into a multiplication but `__nv_pow(a, 2)` is not. Doesn't CUDA have the same problem?
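
The libcall-folding concern quoted in this message can be pictured with a small sketch. It assumes nothing about the patch itself: `opaque_pow` is a hypothetical stand-in for a vendor routine such as `__nv_pow`, and `square_via_pow` / `square_via_opaque` are illustrative names. Both paths compute the same value; the point raised in the thread is only about which call an optimizer is able to simplify. In this single-file sketch a compiler could inline the stand-in, whereas the real `__nv_*` functions come from libdevice.

```cpp
#include <cmath>

// Hypothetical stand-in for a vendor routine such as __nv_pow. Its name is not
// a known libcall, so libcall simplification does not apply at call sites that
// use it. The body forwards to std::pow purely for illustration.
double opaque_pow(double base, double exponent) {
  return std::pow(base, exponent);
}

// Calls the standard function directly; a compiler that recognizes pow as a
// libcall may rewrite this into a * a when optimizing.
double square_via_pow(double a) { return std::pow(a, 2.0); }

// Same computation routed through the stand-in.
double square_via_opaque(double a) { return opaque_pow(a, 2.0); }
```

Comparing the optimized output of the two `square_*` functions (for example with `-O2 -S`) is one way to check whether the fold happens for a given compiler and target.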

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-10 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld removed a reviewer: Hahnfeld. Hahnfeld added a comment. I feel like there is no progress in the discussion (here and off-list), partly because we might still not be talking about the same things. So I'm stepping down from this revision to unblock review from somebody else. Here's my cu

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192493, @gtbercea wrote: > @Hahnfeld do you get the same error if you compile with clang++ instead of clang? Yes, with both trunk and this patch applied. It's the same header after all...

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. @Hahnfeld do you get the same error if you compile with clang++ instead of clang?

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1192383, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1192375, @gtbercea wrote: > > > I do not get that error. > > > In the beginning you said that you were facing the same error. Did that go > away in the meantime? > Are you

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1192368, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1192321, @gtbercea wrote: > > > > IIRC you started to work on this to fix the problem with inline assembly > > > (see https://reviews.llvm.org/D47849#1125019). AFAICS this

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192375, @gtbercea wrote: > I do not get that error. In the beginning you said that you were facing the same error. Did that go away in the meantime? Are you testing on x86 or Power? With optimizations enabled?

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1192368, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1192321, @gtbercea wrote: > > > > IIRC you started to work on this to fix the problem with inline assembly > > > (see https://reviews.llvm.org/D47849#1125019). AFAICS this

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192321, @gtbercea wrote: > > IIRC you started to work on this to fix the problem with inline assembly > > (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes > > declarations of math functions but you still cannot

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > IIRC you started to work on this to fix the problem with inline assembly (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes declarations of math functions but you still cannot include `math.h` which most "correct" codes do. I'm not sure w

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1192245, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1192134, @gtbercea wrote: > > > This patch is concerned with calling device functions when you're on the > > device. The correctness issues you mention are orthogonal to th

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1192134, @gtbercea wrote: > This patch is concerned with calling device functions when you're on the device. The correctness issues you mention are orthogonal to this and should be handled by another patch. I don't think this patc

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > Ok, so you are already talking about performance. I think we should fix correctness first, in particular the compiler shouldn't complain whenever `` is included. This patch is concerned with calling device functions when you're on the device. The correctness i

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-08 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1190997, @gtbercea wrote: > Don't we want to use device specific math functions? It's not just about avoiding some of the host specific assembly, it's also about getting an implementation tailored to the device. Ok, so you are alre

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 159574. gtbercea added a comment. Prevent math builtins from being used for nvptx toolchain. Repository: rC Clang https://reviews.llvm.org/D47849 Files: include/clang/Driver/ToolChain.h lib/Driver/ToolChains/Clang.cpp lib/Driver/ToolChains/Cuda.cp

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1190903, @Hahnfeld wrote: > Do we still need this? I think what we really need to solve is the problem of (host) inline assembly in the header files... Don't we want to use device specific math functions? It's not just about avoidi

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-07 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. Do we still need this? I think what we really need to solve is the problem of (host) inline assembly in the header files...

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 159335. gtbercea added a comment. Fix function call. Repository: rC Clang https://reviews.llvm.org/D47849 Files: include/clang/Driver/ToolChain.h lib/Driver/ToolChains/Clang.cpp lib/Driver/ToolChains/Cuda.cpp lib/Driver/ToolChains/Cuda.h lib

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment. In https://reviews.llvm.org/D47849#1184388, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1184367, @hfinkel wrote: > > > The problem is that the inline assembly might actually be for the target, > > instead of the host, because we also have target preprocessor ma

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1184367, @hfinkel wrote: > The problem is that the inline assembly might actually be for the target, instead of the host, because we also have target preprocessor macros defined, and it's going to be hard to tell. I'm not sure tha

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment. In https://reviews.llvm.org/D47849#1183996, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1183150, @hfinkel wrote: > > > Hrmm. Doesn't that make it so that whatever functions are implemented using > > that inline assembly will not be callable from target code (or

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-08-01 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1183150, @hfinkel wrote: > Hrmm. Doesn't that make it so that whatever functions are implemented using that inline assembly will not be callable from target code (or, perhaps worse, will crash the backend if called)? You are rig

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-07-31 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment. In https://reviews.llvm.org/D47849#1183134, @Hahnfeld wrote: > In https://reviews.llvm.org/D47849#1124861, @hfinkel wrote: > > > In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > > > > > 2. Incidentally I ran into a closely related problem: I can't `#include

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-07-31 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1124861, @hfinkel wrote: > In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > > > 2. Incidentally I ran into a closely related problem: I can't `#include > > ` in translation units compiled for offloading, Clang complains

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-07-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47849#1126925, @gtbercea wrote: > I just stumbled upon a very interesting situation. I noticed that, for OpenMP, the use of device math functions happens as I expected for -O0. For -O1 or higher math functions such as "sqrt" resolve to l

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. I just stumbled upon a very interesting situation. I noticed that, for OpenMP, the use of device math functions happens as I expected for -O0. For -O1 or higher math functions such as "sqrt" resolve to llvm builtins/intrinsics: call double @llvm.sqrt.f64(double %1)

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > IMO this goes in the right direction, we should use the fast implementation in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's ok to use header wrappers as CUDA already does

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. In https://reviews.llvm.org/D47849#1125019, @gtbercea wrote: > It's precisely the issue which you report here. Since you don't use device specific math functions, you can run into the problem where you may end up calling assembly instructions for a different archit

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > IMO this goes in the right direction, we should use the fast implementation in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's ok to use header wrappers as CUDA already

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Headers/__clang_cuda_device_functions.h:65 } +#if defined(__cplusplus) __DEVICE__ void __brkpt() { asm volatile("brkpt;"); } Hahnfeld wrote: > Why is that only valid for C++? C does not support overloading of func
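
The reply about overloading can be illustrated with a minimal sketch. The function name `clamp_low` is hypothetical and not taken from the header under review; the sketch only shows why a second definition with the same name has to sit behind a `__cplusplus` guard when a header may also be compiled as C.

```cpp
// Hypothetical example, unrelated to the actual contents of
// __clang_cuda_device_functions.h.

// Valid in both C and C++: the only definition named clamp_low so far.
static inline int clamp_low(int x) { return x < 0 ? 0 : x; }

#if defined(__cplusplus)
// A second function with the same name but a different parameter type is an
// overload in C++. In C it would be a conflicting redefinition, so it is
// only emitted when compiling as C++.
static inline double clamp_low(double x) { return x < 0.0 ? 0.0 : x; }
#endif
```

With the guard in place, a C translation unit sees one `clamp_low`, while a C++ translation unit sees both and picks one by argument type.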

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Hal Finkel via Phabricator via cfe-commits
hfinkel added a comment. In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > IMO this goes in the right direction, we should use the fast implementation in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's ok to use header wrappers as CUDA already

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Jonas Hahnfeld via Phabricator via cfe-commits
Hahnfeld added a comment. IMO this goes in the right direction, we should use the fast implementation in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's ok to use header wrappers as CUDA already does. Two questions: 1. Can you explain where this is important f

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-06 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added a comment. Add tests for C++ and move OpenMP specific tests to the OpenMP directory. Comment at: lib/Headers/__clang_cuda_device_functions.h:28 +#if defined(_OPENMP) +#include <__clang_cuda_libdevice_declares.h> +#include Do we really need to include

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: Hahnfeld, tra, hfinkel, carlo.bertolli, caomhin, ABataev. Herald added subscribers: cfe-commits, guansong, mgorny. In current Clang, on the OpenMP NVPTX toolchain, math functions are resolved as math functions for the host. For example, a
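
For readers unfamiliar with the setup, the kind of code this revision is about looks roughly like the sketch below: a standard math call inside an OpenMP target region. The function name `sqrt_in_target_region` is an assumption for illustration and does not come from the patch or its tests. Built without OpenMP the pragma is ignored and the loop runs on the host; built with offloading (for example `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda`, assuming a toolchain configured for it) the question discussed in the thread is which `sqrt` the device code ends up calling.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Takes the square root of every element in place. The loop body is the part
// that would be compiled for the device when offloading is enabled.
void sqrt_in_target_region(std::vector<double>& values) {
  double* data = values.data();
  const std::size_t n = values.size();

  // Map the array to the device and back; this falls back to host execution
  // when no offload device is available or OpenMP is disabled.
  #pragma omp target teams distribute parallel for map(tofrom: data[0:n])
  for (std::size_t i = 0; i < n; ++i) {
    data[i] = std::sqrt(data[i]);
  }
}
```

Inspecting the device-side LLVM IR for this function at different optimization levels is one way to observe the behavior reported later in the thread.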