[PATCH] D49188: [OpenMP] Initialize data sharing for SPMD case

2018-07-11 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, carlo.bertolli, caomhin. Herald added subscribers: cfe-commits, guansong, jholewinski. In the SPMD case, we need to initialize the data sharing and globalization infrastructure. This covers the case when an SPMD region calls a

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 150947. gtbercea added a comment. Added separate test. Repository: rC Clang https://reviews.llvm.org/D47394 Files: include/clang/Driver/Action.h include/clang/Driver/Compilation.h include/clang/Driver/Driver.h include/clang/Driver/Options.td

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-11 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea marked 3 inline comments as done. gtbercea added inline comments. Comment at: include/clang/Driver/Compilation.h:312 + /// \param skipBundler - bool value set once by the driver. + void setSkipOffloadBundler(bool skipBundler); + sfantao wrote: > Why

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. I just stumbled upon a very interesting situation. I noticed that, for OpenMP, the use of device math functions happens as I expected for -O0. For -O1 or higher, math functions such as "sqrt" resolve to llvm builtins/intrinsics: call double @llvm.sqrt.f64(double %1)
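For readers following the thread, here is a minimal sketch of the kind of source the comment above is about: a call to sqrt inside an OpenMP target region. The function name device_sqrt is hypothetical and not taken from the patch. Which symbol the call resolves to depends on the toolchain and optimization level, as the comment describes; this sketch only shows the source shape.

```cpp
#include <cmath>

// Hypothetical helper, not part of D47849: calls sqrt inside an OpenMP
// target region. When compiled without OpenMP support the pragma is
// ignored and the body simply runs on the host.
double device_sqrt(double x) {
  double result = 0.0;
#pragma omp target map(to : x) map(from : result)
  {
    // Per the thread, this call may be lowered to @llvm.sqrt.f64 at -O1
    // or higher rather than to a device math function.
    result = std::sqrt(x);
  }
  return result;
}
```

Compiling such a file for an NVPTX offload target at different optimization levels and inspecting the emitted IR is one way to observe the difference the comment reports.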

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47849#1124638, @Hahnfeld wrote: > IMO this goes into the right direction, we should use the fast implementation > in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think > it's ok to use header wrappers as CUDA

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-07 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Headers/__clang_cuda_device_functions.h:65 } +#if defined(__cplusplus) __DEVICE__ void __brkpt() { asm volatile("brkpt;"); } Hahnfeld wrote: > Why is that only valid for C++? C does not support overloading of
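The point about C lacking overloading can be illustrated with a small sketch. The names below are hypothetical and are not from __clang_cuda_device_functions.h; the sketch only assumes that a header shared between C and C++ has to hide same-named functions with different parameter types from a C compiler.

```cpp
// Hypothetical header fragment. Two functions with the same name and
// different parameter types are legal in C++ but rejected by a C
// compiler, so the pair is only declared when __cplusplus is defined.
#if defined(__cplusplus)
inline int abs_value(int x) { return x < 0 ? -x : x; }
inline double abs_value(double x) { return x < 0 ? -x : x; }
#endif
```

A C translation unit including this fragment sees neither definition, which avoids a redefinition error at the cost of not providing the functions in C.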

[PATCH] D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls

2018-06-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: Hahnfeld, tra, hfinkel, carlo.bertolli, caomhin, ABataev. Herald added subscribers: cfe-commits, guansong, mgorny. In current Clang, on the OpenMP NVPTX toolchain, math functions are resolved as math functions for the host. For example,

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. @tra Thank you for your comments and help with the patch. Repository: rC Clang https://reviews.llvm.org/D47394 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-04 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47394#1120255, @Hahnfeld wrote: > In https://reviews.llvm.org/D47394#1119489, @gtbercea wrote: > > > In https://reviews.llvm.org/D47394#1119056, @Hahnfeld wrote: > > > > > Hmm, maybe the scope is much larger: I just tried linking an

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-01 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47394#1119056, @Hahnfeld wrote: > Hmm, maybe the scope is much larger: I just tried linking an executable that > references a `declare target` function in a shared library. My assumption was > that this already works, given that

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-06-01 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > I disagree in this context because this patch currently means that static > archives will only work with NVPTX and there is no clear path how to "fix" > things for other offloading targets. I'll try to work on my proposal over the > next few days (sorry, very busy

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-31 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. The error is related to lack of device linking, just like you explained two paragraphs down. This is the error I get: main.o: In function `__cuda_module_ctor': main.cu:(.text+0x674): undefined reference to `__cudaRegisterLinkedBinary__nv_c5b75865' You nailed the

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-31 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. > Assuming we do proceed with back-to-CUDA approach, one thing I'd consider > would be using clang's -fcuda-include-gpubinary option which CUDA uses to > include GPU code into the host object. You may be able to use it to avoid > compiling and partially linking

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D47394#1114848, @sfantao wrote: > Just to clarify one thing in my last comment: > > When I say that we didn't aim at having clang compatible with other > compilers, I mean the OpenMP offloading descriptors, where all the variables > and

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:536 + } } sfantao wrote: > What prevents all this from being done in the bundler? If I understand it > correctly, if the bundler implements this wrapping all the checks for >

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: test/Driver/openmp-offload.c:497 // RUN: %clang -### -fopenmp=libomp -o %t.out -lsomelib -target powerpc64le-linux -fopenmp-targets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.i -no-canonical-prefixes 2>&1 \ // RUN: |

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-29 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: test/Driver/openmp-offload.c:497 // RUN: %clang -### -fopenmp=libomp -o %t.out -lsomelib -target powerpc64le-linux -fopenmp-targets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.i -no-canonical-prefixes 2>&1 \ // RUN: |

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-25 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 148677. Repository: rC Clang https://reviews.llvm.org/D47394 Files: include/clang/Driver/Action.h include/clang/Driver/Compilation.h include/clang/Driver/Driver.h include/clang/Driver/ToolChain.h lib/Driver/Action.cpp

[PATCH] D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain

2018-05-25 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: Hahnfeld, hfinkel, caomhin, carlo.bertolli, tra. Herald added subscribers: cfe-commits, guansong. So far, the clang-offload-bundler has been the default tool for bundling together various file types produced by the different OpenMP

[PATCH] D44541: [OpenMP][Clang] Move device global stack init before master-workers split

2018-03-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea abandoned this revision. gtbercea added a comment. This leads to statically allocated shared data being used before it is initialized in the runtime structures by the master thread in the kernel_init() function. New patch available with worker and master-side initialization. Repository: rC

[PATCH] D44749: [OpenMP][Clang] Add call to global data sharing stack initialization on the workers side

2018-03-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, grokos, carlo.bertolli, caomhin. Herald added subscribers: cfe-commits, guansong, jholewinski. The workers also need to initialize the global stack. The call to the initialization function needs to happen after the kernel_init()

[PATCH] D44541: [OpenMP][Clang] Move device global stack init before master-workers split

2018-03-20 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 139159. gtbercea added a comment. Fix test. Repository: rC Clang https://reviews.llvm.org/D44541 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp test/OpenMP/nvptx_data_sharing.cpp

[PATCH] D44588: [OpenMP][Clang] Pass global thread ID to outlined function

2018-03-19 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea abandoned this revision. gtbercea added a comment. After some internal discussion with @ABataev, he is going to replace the manual computation of the thread ID with a call to the runtime in a new patch. Repository: rC Clang https://reviews.llvm.org/D44588

[PATCH] D44588: [OpenMP][Clang] Pass global thread ID to outlined function

2018-03-16 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, grokos, carlo.bertolli, caomhin. Herald added subscribers: cfe-commits, guansong, jholewinski. The data sharing wrapper function needs to pass a valid global thread ID to the parallel outlined function when the parallel is

[PATCH] D44541: [OpenMP][Clang] Move device global stack init before master-workers split

2018-03-15 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, carlo.bertolli, grokos, caomhin. Herald added subscribers: cfe-commits, guansong, jholewinski. This patch moves the call to the stack init data sharing function before the splitting of threads into master and workers. This

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138275. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138274. gtbercea added a comment. - Revert - Add back. - Improve tests. - Add bclib. - Fix. - Fix. Repository: rC Clang https://reviews.llvm.org/D43197 Files: test/Driver/openmp-offload-gpu.c Index: test/Driver/openmp-offload-gpu.c

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138266. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138265. gtbercea added a comment. Update patch manually. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138262. gtbercea added a comment. Test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: test/Driver/openmp-offload-gpu.c Index: test/Driver/openmp-offload-gpu.c === ---

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138261. gtbercea added a comment. Add bclib. Repository: rC Clang https://reviews.llvm.org/D43197 Files: test/Driver/openmp-offload-gpu.c Index: test/Driver/openmp-offload-gpu.c ===

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138260. gtbercea added a comment. Improve test robustness for the case when CUDA libdevice cannot be found. Check that the warning is not emitted when the bc lib is found. Repository: rC Clang https://reviews.llvm.org/D43197 Files:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138242. gtbercea added a comment. Address comments. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138002. gtbercea added a comment. Add input file. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138001. gtbercea added a comment. Fixes. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 138000. gtbercea added a comment. Rename folder. Fix test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137999. gtbercea added a comment. Herald added a subscriber: jholewinski. Change name of folder. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/CodeGen/CGDecl.cpp

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: test/Driver/openmp-offload-gpu.c:150 +/// bitcode library and add it to the LIBRARY_PATH. +// RUN: touch %T/libomptarget-nvptx-sm_60.bc +// RUN: env LIBRARY_PATH=%T %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137769. gtbercea added a comment. Fix test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/lib/libomptarget-nvptx-sm_20.bc

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: test/Driver/openmp-offload-gpu.c:150 +/// bitcode library and add it to the LIBRARY_PATH. +// RUN: touch %T/libomptarget-nvptx-sm_60.bc +// RUN: env LIBRARY_PATH=%T %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137755. gtbercea added a comment. Revert to c_str(). Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/lib/libomptarget-nvptx-sm_60.bc

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:592 +Twine("lib") + CLANG_LIBDIR_SUFFIX); +LibraryPaths.emplace_back(DefaultLibPath.c_str()); + ABataev wrote: > Do you still need `.c_str()` here? Doesn't compile without it

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137754. gtbercea added a comment. Change test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/Inputs/lib/libomptarget-nvptx-sm_60.bc

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-09 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea marked an inline comment as done. gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:536-542 + StringRef CompilerPath = env; + while (!CompilerPath.empty()) { +std::pair<StringRef, StringRef> Split = +

[PATCH] D43660: [OpenMP] Add OpenMP data sharing infrastructure using global memory

2018-03-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137600. gtbercea added a comment. Patch splitting: limit support in this patch to standalone target regions only. Support for combined directives will be fully covered in a subsequent patch. Repository: rC Clang https://reviews.llvm.org/D43660 Files:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:536-542 + StringRef CompilerPath = env; + while (!CompilerPath.empty()) { +std::pair<StringRef, StringRef> Split = +CompilerPath.split(llvm::sys::EnvPathSeparator); +
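The quoted loop walks a LIBRARY_PATH-style string one entry at a time. A rough standard-library sketch of the same idea follows; it is not the code from the patch. The function name splitSearchPath, the ':' default separator, and the choice to skip empty entries are all assumptions made for illustration (the patch uses StringRef::split with llvm::sys::EnvPathSeparator, and the rest of its loop body is not shown in the snippet).

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the driver loop: split a path list on a
// separator character and collect the non-empty entries in order.
std::vector<std::string> splitSearchPath(const std::string &Path,
                                         char Sep = ':') {
  std::vector<std::string> Dirs;
  std::string::size_type Start = 0;
  while (Start <= Path.size()) {
    // Find the end of the current entry, or the end of the string.
    std::string::size_type End = Path.find(Sep, Start);
    if (End == std::string::npos)
      End = Path.size();
    // Assumption: empty entries (e.g. from "a::b") are skipped.
    if (End > Start)
      Dirs.push_back(Path.substr(Start, End - Start));
    Start = End + 1;
  }
  return Dirs;
}
```

Each returned directory could then be probed for a file such as libomptarget-nvptx-sm_60.bc, which is the file name the tests in this review create under LIBRARY_PATH.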

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137233. gtbercea added a comment. - Fix message and test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137230. gtbercea added a comment. Fix test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137226. gtbercea added a comment. Address comments. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137219. gtbercea added a comment. Address comments. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43660: [OpenMP] Add OpenMP data sharing infrastructure using global memory

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137210. gtbercea added a comment. Add init stack function. Repository: rC Clang https://reviews.llvm.org/D43660 Files: lib/CodeGen/CGDecl.cpp lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-06 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 137203. gtbercea added a comment. Address comments. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-05 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:536-542 + StringRef CompilerPath = env; + while (!CompilerPath.empty()) { +std::pair<StringRef, StringRef> Split = +CompilerPath.split(llvm::sys::EnvPathSeparator); +

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-03-05 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D43197#1011256, @Hahnfeld wrote: > Looking more closely at the patch, this doesn't seem to look into the `lib` / > `lib64` next to the compiler. I'm not sure if `LIBRARY_PATH` is set for every > installation, so I think we should add this

[PATCH] D43625: [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory

2018-03-01 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 136570. gtbercea added a comment. Add Source location. Repository: rC Clang https://reviews.llvm.org/D43625 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGOpenMPRuntimeNVPTX.h test/OpenMP/nvptx_data_sharing.cpp

[PATCH] D43660: [OpenMP] Add OpenMP data sharing infrastructure using global memory

2018-03-01 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 136528. Repository: rC Clang https://reviews.llvm.org/D43660 Files: lib/CodeGen/CGDecl.cpp lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGOpenMPRuntimeNVPTX.h

[PATCH] D43660: [OpenMP] Add OpenMP data sharing infrastructure using global memory

2018-02-22 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld. Herald added subscribers: cfe-commits, guansong, jholewinski. This patch handles the Clang code generation phase for the OpenMP data sharing infrastructure. TODO: add a more detailed

[PATCH] D43625: [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory

2018-02-22 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, carlo.bertolli, caomhin. Herald added subscribers: cfe-commits, guansong, jholewinski. Remove this scheme for now since it will be covered by another more generic scheme using global memory. This code will be worked into an

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 134295. gtbercea added a comment. Fix test. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 134292. gtbercea added a comment. Revert. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D43197#1007918, @Hahnfeld wrote: > I'm still not sure we can't run this test on Windows. I think lots of other > tests use `touch`, even some specific to Windows... Let me know what you'd like me to do. I can add the test back. I do see

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 134278. gtbercea added a comment. Use %T. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: test/Driver/unix-openmp-offload-gpu.c:15 +/// bitcode library that will be found via the LIBRARY_PATH. +// RUN: touch /tmp/libomptarget-nvptx-sm_60.bc +// RUN: env LIBRARY_PATH=/tmp Hahnfeld wrote: > Hahnfeld

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 134238. gtbercea added a comment. Fix tmp folder name. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea marked an inline comment as done. gtbercea added inline comments. Comment at: test/Driver/openmp-offload-gpu.c:150 +/// bitcode library that will be found via the LIBRARY_PATH. +// RUN: touch /tmp/libomptarget-nvptx-sm_60.bc +// RUN: LIBRARY_PATH=/tmp %clang -###

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-14 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 134235. gtbercea added a comment. Move unix specific test to new file. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 133919. gtbercea added a comment. Add regression tests. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp test/Driver/openmp-offload-gpu.c Index:

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 133882. gtbercea added a comment. Fix warning message. Repository: rC Clang https://reviews.llvm.org/D43197 Files: include/clang/Basic/DiagnosticDriverKinds.td lib/Driver/ToolChains/Cuda.cpp Index: lib/Driver/ToolChains/Cuda.cpp

[PATCH] D43197: [OpenMP] Add flag for linking runtime bitcode library

2018-02-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos. Herald added subscribers: cfe-commits, guansong. This patch adds an additional flag to the OpenMP device offloading toolchain to link in the runtime library bitcode. Repository: rC

[PATCH] D42841: [docs] Improve help for OpenMP options

2018-02-08 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea accepted this revision. gtbercea added a comment. LG https://reviews.llvm.org/D42841

[PATCH] D41486: [OpenMP][Clang] Add missing argument to runtime functions.

2018-01-03 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea abandoned this revision. gtbercea added a comment. Functionality already landed. See previous comment. Repository: rL LLVM https://reviews.llvm.org/D41486

[PATCH] D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend

2017-12-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea closed this revision. gtbercea added a comment. Committed here https://reviews.llvm.org/D41123 Repository: rL LLVM https://reviews.llvm.org/D40451

[PATCH] D41486: [OpenMP][Clang] Add missing argument to runtime functions.

2017-12-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 127865. gtbercea added a comment. Address comments. Repository: rL LLVM https://reviews.llvm.org/D41486 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp test/OpenMP/nvptx_data_sharing.cpp test/OpenMP/nvptx_target_teams_codegen.cpp Index:

[PATCH] D41486: [OpenMP][Clang] Add missing argument to runtime functions.

2017-12-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D41486#961981, @Hahnfeld wrote: > https://reviews.llvm.org/D41012? This patch doesn't update the documentation > with function signatures. Ok so I see that your patch uses a different order of the arguments. I've just added the data

[PATCH] D41486: [OpenMP][Clang] Add missing argument to runtime functions.

2017-12-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: ABataev, carlo.bertolli, hfinkel, Hahnfeld, caomhin. Herald added a subscriber: jholewinski. This patch adds a missing argument to the runtime interface. Tests are adjusted accordingly. Repository: rL LLVM

[PATCH] D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget

2017-12-21 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: carlo.bertolli, ABataev, Hahnfeld, grokos, caomhin, hfinkel. This patch extends the libomptarget functionality in patch https://reviews.llvm.org/D14254 with support for the data sharing scheme for supporting implicitly shared variables.

[PATCH] D41123: [OpenMP] Add function attribute for triggering data sharing.

2017-12-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 126594. gtbercea added a comment. Fix test. Repository: rL LLVM https://reviews.llvm.org/D41123 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp test/OpenMP/nvptx_data_sharing.cpp Index: test/OpenMP/nvptx_data_sharing.cpp

[PATCH] D41123: [OpenMP] Add function attribute for triggering data sharing.

2017-12-12 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. gtbercea added reviewers: hfinkel, Hahnfeld, ABataev, carlo.bertolli, caomhin. Herald added a subscriber: jholewinski. The backend should only emit data sharing code for the cases where it is needed. A new function attribute is used by Clang to enable data sharing

[PATCH] D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend

2017-11-24 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 124246. Repository: rL LLVM https://reviews.llvm.org/D40451 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp test/OpenMP/nvptx_data_sharing.cpp Index: test/OpenMP/nvptx_data_sharing.cpp ===

[PATCH] D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend

2017-11-24 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. Herald added a subscriber: jholewinski. Since OpenMP and CUDA share the same toolchain we need to disable: - the lowering of variables to shared memory in the LLVM NVPTX backend - the emission of the shared depot - the emission of shared stack pointers when

[PATCH] D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory

2017-11-24 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 124243. gtbercea added a comment. Add regression tests and allow for shared memory lowering to be disabled at function level. Repository: rL LLVM https://reviews.llvm.org/D38978 Files: include/llvm/CodeGen/TargetPassConfig.h

[PATCH] D40250: [OpenMP] Consistently use cubin extension for nvlink

2017-11-20 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea accepted this revision. gtbercea added a comment. This revision is now accepted and ready to land. LG https://reviews.llvm.org/D40250

[PATCH] D40250: [OpenMP] Consistently use cubin extension for nvlink

2017-11-20 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:431 -SmallString<256> Name(II.getFilename()); -llvm::sys::path::replace_extension(Name, "cubin"); - -const char *CubinF = -C.addTempFile(C.getArgs().MakeArgString(Name)); +const

[PATCH] D40250: [OpenMP] Consistently use cubin extension for nvlink

2017-11-20 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:431 -SmallString<256> Name(II.getFilename()); -llvm::sys::path::replace_extension(Name, "cubin"); - -const char *CubinF = -C.addTempFile(C.getArgs().MakeArgString(Name)); +const

[PATCH] D40250: [OpenMP] Consistently use cubin extension for nvlink

2017-11-20 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Clang.cpp:5340 + +const ToolChain *CurTC = (); +if (const auto *OA = dyn_cast(JA.getInputs()[I])) { Please add a comment here describing what this entire code snippet is doing.

[PATCH] D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading

2017-11-03 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 121543. gtbercea edited the summary of this revision. gtbercea added a comment. Remove blocks. https://reviews.llvm.org/D38976 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGOpenMPRuntimeNVPTX.h test/OpenMP/nvptx_data_sharing.cpp

[PATCH] D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading

2017-11-03 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 121538. Repository: rL LLVM https://reviews.llvm.org/D38976 Files: lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp lib/CodeGen/CGOpenMPRuntimeNVPTX.h test/OpenMP/nvptx_data_sharing.cpp test/OpenMP/nvptx_parallel_codegen.cpp

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-18 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. In https://reviews.llvm.org/D39005#900226, @jlebar wrote: > > I'd be interested to get the ball rolling in regard to coming up with a fix > > for this. I see some suggestions in past patches. Some help/clarification > > would be much appreciated. > > Happy to help,

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-17 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. Hi Artem, Justin, I see that this patch is the same as the patch Arpith wanted to post a while back i.e. https://reviews.llvm.org/D17738. Was there a consensus regarding what the right thing to do is in this case? I'd be interested to get the ball rolling in regard

[PATCH] D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory

2017-10-17 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea updated this revision to Diff 119327. gtbercea added a comment. Eliminate variable and function name clean-up. That has been moved into a separate patch: https://reviews.llvm.org/D39005 Repository: rL LLVM https://reviews.llvm.org/D38978 Files:

[PATCH] D39005: [OpenMP] Clean up variable and function names for NVPTX backend

2017-10-17 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. Herald added a subscriber: jholewinski. Clean up variable and function names. Repository: rL LLVM https://reviews.llvm.org/D39005 Files: lib/Target/NVPTX/NVPTXAssignValidGlobalNames.cpp Index: lib/Target/NVPTX/NVPTXAssignValidGlobalNames.cpp

[PATCH] D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory

2017-10-16 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. Herald added subscribers: mgorny, jholewinski. This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the

[PATCH] D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading

2017-10-16 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea created this revision. Herald added a subscriber: jholewinski. This patch is part of the development effort to add support in the current OpenMP GPU offloading implementation for implicitly sharing variables between a target region executed by the team master thread and the worker

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-16 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added a comment. LGTM https://reviews.llvm.org/D38883

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; gtbercea wrote: > tra wrote: > > gtbercea wrote: > > > gtbercea wrote: > > > > I would also like to keep the spirit of this code if not in this exact > > > > form at least

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; tra wrote: > gtbercea wrote: > > gtbercea wrote: > > > I would also like to keep the spirit of this code if not in this exact > > > form at least something that performs the

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; gtbercea wrote: > I would also like to keep the spirit of this code if not in this exact form > at least something that performs the same functionality. @tra what's your

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.cpp:170-182 -// This code prevents IsValid from being set when -// no libdevice has been found. -bool allEmpty = true; -std::string LibDeviceFile; -for (auto key : LibDeviceMap.keys()) { -

[PATCH] D38883: [CMake][OpenMP] Customize default offloading arch

2017-10-13 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
gtbercea added inline comments. Comment at: lib/Driver/ToolChains/Cuda.h:90 - } }; I would also like to keep the spirit of this code if not in this exact form at least something that performs the same functionality. https://reviews.llvm.org/D38883
