Re: [PATCH] D18385: [CUDA] Simplify SemaCUDA/function-overload.cu test.

2016-03-23 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Small nit, LGTM otherwise. Comment at: test/SemaCUDA/function-overload.cu:66 @@ +65,3 @@ +__device__ int d() { return 8; } +// expected-note@-1 0+ {{'d' declared here}} +//

Re: [PATCH] D18380: [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default.

2016-03-22 Thread Artem Belevich via cfe-commits
We need tests to demonstrate that we pick the correct function when we have a mix of HD+H/D in the overload set. Existing tests only cover resolution of {HD,HD}, {H,H}, {D,D}, {H,D} sets. On Tue, Mar 22, 2016 at 4:59 PM, Justin Lebar wrote: > jlebar added a comment. > > In

Re: [PATCH] D18380: [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default.

2016-03-22 Thread Artem Belevich via cfe-commits
tra added a comment. Now that H/D and HD can all be in the same overload set, we'll also need additional tests in CodeGenCUDA/function-overload.cu for cases that now became legal. http://reviews.llvm.org/D18380 ___ cfe-commits mailing list

Re: [PATCH] D18170: [CUDA][OpenMP] Create generic offload toolchains

2016-03-22 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Action.h:79 @@ +78,3 @@ +OFFLOAD_None = 0x00, +OFFLOAD_CUDA = 0x01, + }; Nit: All-caps CUDA looks weird here. _Cuda may be better choice. If you can shorten the prefix that would be nice, too.

Re: [PATCH] D18328: [CUDA] Add option to mark most functions inside as host+device.

2016-03-21 Thread Artem Belevich via cfe-commits
tra added a comment. In http://reviews.llvm.org/D18328#379824, @rsmith wrote: > I would much prefer for us to, say, provide a header that wraps the > system one and does something like > > // > #pragma clang cuda_implicit_host_device { > #include_next > #pragma clang

Re: [PATCH] D18328: [CUDA] Add option to mark most functions inside as host+device.

2016-03-21 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. One minor question, LGTM otherwise. Comment at: lib/Sema/SemaCUDA.cpp:474 @@ +473,3 @@ + SourceLocation Loc = FD.getLocation(); + if (!SM.isInSystemHeader(Loc)) +return

Re: [PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

2016-03-10 Thread Artem Belevich via cfe-commits
There were ambiguities in overload resolution between vector types and their base types. I.e. if I had void foo(int); void foo(int3); then call foo(3) was ambiguous. It wasn't clear whether this extension is supposed to work in C++ at all. On Thu, Mar 10, 2016 at 4:05 PM, Hal Finkel

Re: [PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

2016-03-10 Thread Artem Belevich via cfe-commits
tra abandoned this revision. tra added a comment. Ugh. Found more problems with using vector types in C++. Abandoning the idea. http://reviews.llvm.org/D18051

Re: [PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

2016-03-10 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:72 @@ -71,1 +71,3 @@ +#if defined(CUDA_VECTOR_TYPES) +// Prevent inclusion of CUDA's vector_types.h jlebar wrote: > Hm, this is a surprising (to me) way of controlling this

Re: [PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

2016-03-10 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 50341. tra marked an inline comment as done. tra added a comment. Removed unneeded struct attributes. http://reviews.llvm.org/D18051 Files: lib/Headers/CMakeLists.txt lib/Headers/__clang_cuda_runtime_wrapper.h lib/Headers/__clang_cuda_vector_types.h

[PATCH] D18051: [CUDA] Provide CUDA's vector types implemented using clang's vector extension.

2016-03-10 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. This provides substantial performance boost on some benchmarks (~25% on SHOC's FFT) due to vectorized loads/stores. Unfortunately existing CUDA headers and user code occasionally take pointer to

r262516 - Fixed test failure platforms with name mangling different from Linux.

2016-03-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 2 15:03:20 2016 New Revision: 262516 URL: http://llvm.org/viewvc/llvm-project?rev=262516&view=rev Log: Fixed test failure platforms with name mangling different from Linux. * Run cc with -triple x86_64-linux-gnu to make symbol mangling predictable. * Use temporary file as a

Re: [PATCH] D17780: [CUDA] Do not generate unnecessary runtime init code.

2016-03-02 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL262499: [CUDA] Do not generate unnecessary runtime init code. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D17780?vs=49539&id=49646#toc Repository: rL LLVM

r262498 - [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 2 12:28:50 2016 New Revision: 262498 URL: http://llvm.org/viewvc/llvm-project?rev=262498&view=rev Log: [CUDA] Emit host-side 'shadows' for device-side global variables ... and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use

r262499 - [CUDA] Do not generate unnecessary runtime init code.

2016-03-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Mar 2 12:28:53 2016 New Revision: 262499 URL: http://llvm.org/viewvc/llvm-project?rev=262499&view=rev Log: [CUDA] Do not generate unnecessary runtime init code. Differential Revision: http://reviews.llvm.org/D17780 Modified: cfe/trunk/lib/CodeGen/CGCUDANV.cpp

Re: [PATCH] D17779: [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-01 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 49561. tra marked 9 inline comments as done. tra added a comment. Addressed Justin's comments. http://reviews.llvm.org/D17779 Files: lib/CodeGen/CGCUDANV.cpp lib/CodeGen/CGCUDARuntime.h lib/CodeGen/CodeGenModule.cpp test/CodeGenCUDA/device-stub.cu

[PATCH] D17780: [CUDA] Do not generate unnecessary runtime init code.

2016-03-01 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. Do not generate runtime init code if we don't have anything to init. http://reviews.llvm.org/D17780 Files: lib/CodeGen/CGCUDANV.cpp test/CodeGenCUDA/device-stub.cu Index:

[PATCH] D17779: [CUDA] Emit host-side 'shadows' for device-side global variables

2016-03-01 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue. tra added a subscriber: cfe-commits. .. and register them with CUDA runtime. This is needed for commonly used cudaMemcpy*() APIs that use address of host-side shadow to access their counterparts on device side. Fixes PR26340.

r261778 - [CUDA] do not allow attribute-based overloading for __global__ functions.

2016-02-24 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Feb 24 15:54:45 2016 New Revision: 261778 URL: http://llvm.org/viewvc/llvm-project?rev=261778&view=rev Log: [CUDA] do not allow attribute-based overloading for __global__ functions. __global__ functions are present on both host and device side, so providing __host__ or

Re: [PATCH] D17561: [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.

2016-02-24 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. OK. http://reviews.llvm.org/D17561

Re: [PATCH] D17562: [CUDA] Add hack so code which includes "curand.h" doesn't break.

2016-02-24 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. OK. http://reviews.llvm.org/D17562

Re: [PATCH] D17561: [CUDA] Add conversion operators for threadIdx, blockIdx, gridDim, and blockDim to uint3 and dim3.

2016-02-24 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Headers/cuda_builtin_vars.h:72 @@ -66,1 +71,3 @@ + // uint3). This function is defined after we pull in vector_types.h. + __attribute__((device)) operator uint3() const; private: Considering that built-in variables

Re: [PATCH] D17111: [CUDA] Added --cuda-noopt-device-debug option to control ptxas' debug info generation.

2016-02-16 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL261018: [CUDA] pass debug options to ptxas. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D17111?vs=47680&id=48108#toc Repository: rL LLVM http://reviews.llvm.org/D17111 Files:

r261018 - [CUDA] pass debug options to ptxas.

2016-02-16 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Feb 16 16:03:20 2016 New Revision: 261018 URL: http://llvm.org/viewvc/llvm-project?rev=261018&view=rev Log: [CUDA] pass debug options to ptxas. ptxas optimizations are disabled if we need to generate debug info as ptxas does not accept '-g' otherwise. Differential Revision:

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-12 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL260697: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D16870?vs=47753&id=47819#toc Repository: rL LLVM

r260697 - [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-12 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Feb 12 12:29:18 2016 New Revision: 260697 URL: http://llvm.org/viewvc/llvm-project?rev=260697&view=rev Log: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior. This is an artefact of split-mode CUDA compilation that we need to mimic. HD functions are

r260719 - Added missing '__'.

2016-02-12 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Feb 12 14:26:43 2016 New Revision: 260719 URL: http://llvm.org/viewvc/llvm-project?rev=260719&view=rev Log: Added missing '__'. Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h URL:

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-11 Thread Artem Belevich via cfe-commits
tra requested a review of this revision. tra added a comment. This revision is now accepted and ready to land. @jingyue, @jlebar: can you take a look at the updated version? http://reviews.llvm.org/D16870

Re: [PATCH] D17111: [CUDA] Added --cuda-noopt-device-debug option to control ptxas' debug info generation.

2016-02-11 Thread Artem Belevich via cfe-commits
tra retitled this revision from "[CUDA] pass debug options to ptxas." to "[CUDA] Added --cuda-noopt-device-debug option to control ptxas' debug info generation.". tra updated the summary for this revision. tra updated this revision to Diff 47680. tra added a comment. Added

Re: [PATCH] D17103: [CUDA] Don't crash when trying to printf a non-scalar object.

2016-02-10 Thread Artem Belevich via cfe-commits
tra added a comment. Erasing an argument would only complicate the problem. I guess for consistency we need to match clang's behavior for regular C++ code. For optimized builds it just seems to pass NULL pointer instead. http://reviews.llvm.org/D17103

Re: [PATCH] D17111: [CUDA] pass debug options to ptxas.

2016-02-10 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Tools.cpp:10707 @@ +10706,3 @@ +// ptxas does not accept -g option if optimization is enabled, so we ignore +// compiler's -O* options if we want debug info. +CmdArgs.push_back("-g"); hfinkel wrote: >

[PATCH] D17111: [CUDA] pass debug options to ptxas.

2016-02-10 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, echristo. tra added a subscriber: cfe-commits. ptxas optimizations are disabled if we need to generate debug info as ptxas does not accept '-g' otherwise. http://reviews.llvm.org/D17111 Files: lib/Driver/Tools.cpp
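
A minimal sketch of the flag selection described in this summary, assuming a hypothetical helper `buildPtxasArgs` (this is not the code in lib/Driver/Tools.cpp): when debug info is requested only `-g` is passed and the optimization level is dropped, otherwise an `-O<N>` level is passed.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: choose ptxas flags so that
// '-g' is never combined with an optimization level.
std::vector<std::string> buildPtxasArgs(bool wantDebugInfo, int optLevel) {
  std::vector<std::string> args;
  if (wantDebugInfo) {
    // ptxas does not accept '-g' with optimization enabled,
    // so the requested -O level is ignored here.
    args.push_back("-g");
  } else {
    args.push_back("-O" + std::to_string(optLevel));
  }
  return args;
}
```

With this shape, a debug build gets just `-g` regardless of the compiler's own -O level, which matches the trade-off stated in the summary.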

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-09 Thread Artem Belevich via cfe-commits
tra updated the summary for this revision. tra updated this revision to Diff 47335. tra marked 3 inline comments as done. tra added a comment. Updated the way WrongSide functions are removed from consideration during overload resolution. Previous version could provide inconsistent results

Re: [PATCH] D17056: Mark all CUDA device-side function defs and decls as convergent.

2016-02-09 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CodeGenModule.cpp:1880 @@ +1879,3 @@ +// Conservatively, mark all functions in CUDA as convergent (meaning, they +// may call an intrinsicly convergent op, such as __syncthreads(), and so +// can't have certain

Re: [PATCH] D16932: [CUDA] Bug 26497 : Remove wrappers for variants already provided by CUDA headers.

2016-02-05 Thread Artem Belevich via cfe-commits
tra added a comment. I'm not sure what we could test here without CUDA headers. I've tested out-of-tree by compiling thrust unit tests and the test case in PR. http://reviews.llvm.org/D16932

[PATCH] D16932: [CUDA] Bug 26497 : Remove wrappers for variants already provided by CUDA headers.

2016-02-05 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jingyue, jlebar. tra added a subscriber: cfe-commits. ... and pull them into std namespace with using-declaration instead. http://reviews.llvm.org/D16932 Files: lib/Headers/__clang_cuda_cmath.h Index: lib/Headers/__clang_cuda_cmath.h
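
To illustrate the difference between a wrapper and a using-declaration, here is a small sketch in plain C++. The namespace `demo` and the name `fabs_wrapper` are assumptions made for the example (user code must not add declarations to `std`); the real change lives in lib/Headers/__clang_cuda_cmath.h.

```cpp
#include <math.h>  // declares ::fabs in the global namespace

// 'demo' stands in for the target namespace in this sketch.
namespace demo {

// Style 1: a forwarding wrapper that duplicates an existing function.
inline double fabs_wrapper(double x) { return ::fabs(x); }

// Style 2: a using-declaration re-exports the global-scope function
// (and all of its overloads) without defining anything new.
using ::fabs;

}  // namespace demo
```

Both `demo::fabs_wrapper(-2.0)` and `demo::fabs(-2.0)` call the same underlying function; the using-declaration simply avoids a second definition that could conflict with one already provided elsewhere.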

r259944 - [CUDA] Bug 26497 : Remove wrappers for variants provided by CUDA headers.

2016-02-05 Thread Artem Belevich via cfe-commits
Author: tra Date: Fri Feb 5 16:54:05 2016 New Revision: 259944 URL: http://llvm.org/viewvc/llvm-project?rev=259944&view=rev Log: [CUDA] Bug 26497 : Remove wrappers for variants provided by CUDA headers. ... and pull global-scope ones into std namespace with using-declaration. Differential Revision:

Re: [PATCH] D16932: [CUDA] Bug 26497 : Remove wrappers for variants already provided by CUDA headers.

2016-02-05 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL259944: [CUDA] Bug 26497 : Remove wrappers for variants provided by CUDA headers. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D16932?vs=47040&id=47058#toc Repository: rL LLVM

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-04 Thread Artem Belevich via cfe-commits
tra marked an inline comment as done. Comment at: lib/Sema/SemaCUDA.cpp:132-141 @@ -131,12 +131,12 @@ // (d) HostDevice behavior depends on compilation mode. if (CallerTarget == CFT_HostDevice) { // Calling a function that matches compilation mode is OK. //

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-04 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46927. tra marked an inline comment as done. tra added a comment. Addressed Jingyue's comments. Fixed function-overload.cu tests to reflect stricter call target checks. http://reviews.llvm.org/D16870 Files: include/clang/Sema/Sema.h lib/Sema/SemaCUDA.cpp

Re: [PATCH] D16638: [CUDA] Added device-side system call decls and related wrappers.

2016-02-03 Thread Artem Belevich via cfe-commits
tra closed this revision. tra added a comment. Committed in r259690 http://reviews.llvm.org/D16638

r259690 - [CUDA] added declarations for device-side system calls

2016-02-03 Thread Artem Belevich via cfe-commits
Author: tra Date: Wed Feb 3 14:53:58 2016 New Revision: 259690 URL: http://llvm.org/viewvc/llvm-project?rev=259690&view=rev Log: [CUDA] added declarations for device-side system calls ...and std:: wrappers for free/malloc. Modified: cfe/trunk/lib/Headers/__clang_cuda_runtime_wrapper.h

Re: [PATCH] D16638: [CUDA] Added device-side system call decls and related wrappers.

2016-02-03 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46818. tra added a comment. Updated comment. http://reviews.llvm.org/D16638 Files: lib/Headers/__clang_cuda_runtime_wrapper.h Index: lib/Headers/__clang_cuda_runtime_wrapper.h === ---

[PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-03 Thread Artem Belevich via cfe-commits
tra created this revision. tra added reviewers: jlebar, jingyue, jpienaar, eliben. tra added a subscriber: cfe-commits. This is an artefact of split-mode CUDA compilation that we need to mimic. HD functions are sometimes allowed to call H or D functions. Due to split compilation mode device-side

Re: [PATCH] D16870: [CUDA] Tweak attribute-based overload resolution to match nvcc behavior.

2016-02-03 Thread Artem Belevich via cfe-commits
tra added a comment. When an overload set contains H and HD functions that are otherwise equal for overload resolution, you want to be able to tell which one is better. http://reviews.llvm.org/D16870
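
One way to picture "telling which one is better" is a numeric preference per caller/callee pair. The sketch below is a toy model under stated assumptions: the enum names, the rank order and the function `rankCall` are all made up for illustration and are not taken from the patch, and `__global__` functions are ignored.

```cpp
// Toy model: every name here is hypothetical.
enum class Target { Host, Device, HostDevice };

// Higher value means a better candidate; Never means the call is rejected.
enum Preference { Never = 0, WrongSide, HostDeviceCallee, SameSide, Native };

Preference rankCall(Target caller, Target callee, bool compilingForDevice) {
  if (callee == Target::HostDevice)
    return HostDeviceCallee;  // an HD callee is usable from any caller
  if (caller == callee)
    return Native;            // H calling H, or D calling D
  if (caller == Target::HostDevice) {
    // For an HD caller the answer depends on the current compilation mode.
    Target side = compilingForDevice ? Target::Device : Target::Host;
    return callee == side ? SameSide : WrongSide;
  }
  return Never;               // H calling D, or D calling H
}
```

Under this model a host caller choosing between an H and an HD candidate with otherwise equal signatures would prefer H, since `rankCall(Target::Host, Target::Host, false)` ranks above `rankCall(Target::Host, Target::HostDevice, false)`.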

Re: [PATCH] D16638: [CUDA] Added device-side system call decls and related wrappers.

2016-02-03 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Headers/__clang_cuda_runtime_wrapper.h:232 @@ +231,3 @@ +// Clang will convert printf into vprintf, but we still need +// device-side declaration for it. +__device__ int printf(const char *, ...); jlebar wrote: > I'd

Re: [PATCH] D16638: [CUDA] Added device-side system call decls and related wrappers.

2016-02-03 Thread Artem Belevich via cfe-commits
tra retitled this revision from "[CUDA] Added device-side std::{malloc/free}" to "[CUDA] Added device-side system call decls and related wrappers.". tra updated the summary for this revision. tra updated this revision to Diff 46803. tra marked 3 inline comments as done. tra added a comment.

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-02-02 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46696. tra marked 8 inline comments as done. tra added a comment. Addressed Richard's comments. Relaxed restrictions a bit to allow constant initializers, even those CUDA would not consider to be empty. Updated test case accordingly.

r259592 - [CUDA] Do not allow dynamic initialization of global device side variables.

2016-02-02 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Feb 2 16:29:48 2016 New Revision: 259592 URL: http://llvm.org/viewvc/llvm-project?rev=259592&view=rev Log: [CUDA] Do not allow dynamic initialization of global device side variables. In general CUDA does not allow dynamic initialization of global device-side variables. One

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-02-02 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL259592: [CUDA] Do not allow dynamic initialization of global device side variables. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D15305?vs=46696&id=46707#toc Repository: rL LLVM

Re: [PATCH] D16638: [CUDA] Added device-side std::{malloc/free}

2016-02-02 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46729. tra added a comment. Added few more device-side system calls and related wrapper functions. Added nothrow attributes on malloc/free. http://reviews.llvm.org/D16638 Files: lib/Headers/__clang_cuda_runtime_wrapper.h Index:

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-02-02 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Sema/SemaCUDA.cpp:429-430 @@ +428,4 @@ + CXXConstructorDecl *CD) { + if (!CD->isDefined() && CD->isTemplateInstantiation()) +InstantiateFunctionDefinition(VarLoc, CD->getFirstDecl()); +

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-02-01 Thread Artem Belevich via cfe-commits
Richard, On Fri, Jan 15, 2016 at 5:32 PM, Richard Smith wrote: > On Fri, Jan 15, 2016 at 5:29 PM, Richard Smith > wrote: > > On Fri, Jan 15, 2016 at 4:22 PM, Artem Belevich wrote: > >> tra added inline comments. > >> > >>

Re: [PATCH] D16514: Add -stop-on-failure driver option, and enable it by default for CUDA compiles.

2016-01-28 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Options.td:1807 @@ +1806,3 @@ +"CUDA compilation without --save-temps.">; +def nostop_on_failure : Flag<["-"], "nostop-on-failure">, Flags<[DriverOption]>; + I'd use 'no-' prefix.

Re: [PATCH] D16514: Add -stop-on-failure driver option, and enable it by default for CUDA compiles.

2016-01-28 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: lib/Driver/Driver.cpp:652 @@ -640,3 +651,3 @@ SmallVector, 4> FailingCommands; - C.ExecuteJobs(C.getJobs(), FailingCommands);

Re: [PATCH] D16514: Add -stop-on-failure driver option, and enable it by default for CUDA compiles.

2016-01-28 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Driver.cpp:652 @@ -640,3 +651,3 @@ SmallVector, 4> FailingCommands; - C.ExecuteJobs(C.getJobs(), FailingCommands); + C.ExecuteJobs(C.getJobs(), /* StopOnFailure = */ false, FailingCommands);
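
For readers unfamiliar with the flag under discussion, here is a rough sketch of what a stop-on-failure switch does to a job loop. The `Job` type and this free function `executeJobs` are assumptions for the example and do not reproduce the driver's `Compilation::ExecuteJobs`.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical job model: each job returns an exit code (0 means success).
using Job = std::function<int()>;

// Runs jobs in order and records (index, exit code) for every failure.
// When stopOnFailure is set, the first failing job ends the run.
void executeJobs(const std::vector<Job> &jobs, bool stopOnFailure,
                 std::vector<std::pair<int, int>> &failing) {
  for (int i = 0; i < static_cast<int>(jobs.size()); ++i) {
    int rc = jobs[i]();
    if (rc != 0) {
      failing.emplace_back(i, rc);
      if (stopOnFailure)
        return;
    }
  }
}
```

With the flag off, every job runs and all failures are collected; with it on, jobs after the first failure are never started.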

[PATCH] D16638: [CUDA] Added device-side std::{malloc/free}

2016-01-27 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. In addition to math functions, we also need to support std::malloc and std::free to match NVCC behavior. http://reviews.llvm.org/D16638 Files: lib/Headers/__clang_cuda_cmath.h

Re: [PATCH] D16559: [CUDA] Add -fcuda-allow-variadic-functions.

2016-01-26 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. LGTM. http://reviews.llvm.org/D16559

Re: [PATCH] D16559: [CUDA] Add -fcuda-allow-variadic-functions.

2016-01-26 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: cfe/trunk/include/clang/Driver/CC1Options.td:681 @@ -680,1 +680,3 @@ HelpText<"Enable function overloads based on CUDA target attributes.">; +def fcuda_allow_variadic_functions : Flag<["-"], "fcuda-allow-variadic-functions">, +

[PATCH] D16593: [CUDA] Implemented device-side support for functions in <cmath>.

2016-01-26 Thread Artem Belevich via cfe-commits
tra created this revision. tra added a reviewer: jlebar. tra added a subscriber: cfe-commits. CUDA expects math functions in std:: namespace to work on device side. In order to make it work with clang without allowing device-side code generation for functions w/o appropriate target attributes,

Re: [PATCH] D16593: [CUDA] Implemented device-side support for functions in <cmath>.

2016-01-26 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46041. tra marked 3 inline comments as done. tra added a comment. Added missing :: http://reviews.llvm.org/D16593 Files: lib/Headers/CMakeLists.txt lib/Headers/__clang_cuda_cmath.h lib/Headers/__clang_cuda_runtime_wrapper.h Index:

Re: [PATCH] D16593: [CUDA] Implemented device-side support for functions in <cmath>.

2016-01-26 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 46055. tra marked 6 inline comments as done. tra added a comment. Fixed few issues revealed by -Wdouble-promotion http://reviews.llvm.org/D16593 Files: lib/Headers/CMakeLists.txt lib/Headers/__clang_cuda_cmath.h lib/Headers/__clang_cuda_runtime_wrapper.h

r258880 - [CUDA] Implemented device-side support functions in <cmath>.

2016-01-26 Thread Artem Belevich via cfe-commits
Author: tra Date: Tue Jan 26 17:37:29 2016 New Revision: 258880 URL: http://llvm.org/viewvc/llvm-project?rev=258880&view=rev Log: [CUDA] Implemented device-side support functions in <cmath>. CUDA expects math functions in std:: namespace to work on device side. In order to make it work with clang without

Re: [PATCH] D16593: [CUDA] Implemented device-side support for functions in <cmath>.

2016-01-26 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL258880: [CUDA] Implemented device-side support functions in <cmath>. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D16593?vs=46055&id=46066#toc Repository: rL LLVM

Re: [PATCH] D16501: [CUDA] Don't generate aliases for static extern "C" functions.

2016-01-25 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. OK. If someone attempts to rely on this feature in CUDA it will be obviously broken due to the missing C-style mangled name. We can deal with it then. http://reviews.llvm.org/D16501

Re: [PATCH] D16499: [CUDA] Disable ctor/dtor aliases in device code.

2016-01-25 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D16499

Re: [PATCH] D16501: [CUDA] Don't generate aliases for static extern "C" functions.

2016-01-25 Thread Artem Belevich via cfe-commits
tra added a comment. Failing silently is not a good idea. At the very least it should produce an error. The right thing to do here, IMO, would be to generate a stub with the alias name that just jumps to or calls the aliasee. http://reviews.llvm.org/D16501

Re: [PATCH] D16484: [CUDA] Disallow variadic functions other than printf in device code.

2016-01-22 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Small nit. LGTM otherwise. Comment at: lib/Sema/SemaDecl.cpp:8291-8293 @@ +8290,5 @@ +// in device-side CUDA code. +if (NewFD->isVariadic() && (NewFD->hasAttr() || +

Re: [PATCH] D16307: [CUDA] Handle -O options (more) correctly.

2016-01-19 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D16307

Re: [PATCH] D16261: [CUDA] Only allow __global__ on free functions and static member functions.

2016-01-19 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Basic/DiagnosticSemaKinds.td:6418 @@ +6417,3 @@ +def warn_nvcc_compat_kern_is_method : Warning< + "kernel function %0 is a member function; this may not be accepted by nvcc">, + InGroup; There's an

Re: [PATCH] D16261: [CUDA] Only allow __global__ on free functions and static member functions.

2016-01-19 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Small diags nit. LGTM otherwise. Comment at: lib/Sema/SemaDeclAttr.cpp:3620-3629 @@ -3619,2 +3619,12 @@ } + if (const auto *Method = dyn_cast(FD)) { +if

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-19 Thread Artem Belevich via cfe-commits
tra marked 3 inline comments as done. Comment at: lib/Sema/SemaCUDA.cpp:436 @@ +435,3 @@ + if (CD->isTrivial()) +return true; + jlebar wrote: > The test passes if I comment out this if statement. I'm not sure if that's > expected; this may or may not be

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-19 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 45312. tra marked 2 inline comments as done. tra added a comment. Addressed Justin's comments. http://reviews.llvm.org/D15305 Files: include/clang/Basic/DiagnosticSemaKinds.td include/clang/Sema/Sema.h lib/CodeGen/CGDeclCXX.cpp

Re: [PATCH] D16261: [CUDA] Only allow __global__ on free functions and static member functions.

2016-01-19 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Sema/SemaDeclAttr.cpp:3620-3629 @@ -3619,2 +3619,12 @@ } + if (const auto *Method = dyn_cast(FD)) { +if (Method->isInstance()) { + S.Diag(Method->getLocStart(), diag::err_kern_is_nonstatic_method) + << Method; +

Re: [PATCH] D16331: [CUDA] Bail, rather than crash, on va_arg in device code.

2016-01-19 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Small nit. LGTM otherwise. Comment at: lib/Sema/SemaExpr.cpp:11732 @@ +11731,3 @@ + // CUDA device code does not support varargs. + if (getLangOpts().CUDAIsDevice) { +if

Re: [PATCH] D16250: Respect bound archs, even when they don't alter the toolchain.

2016-01-15 Thread Artem Belevich via cfe-commits
tra added a comment. Looks OK to me. Perhaps BoundArch w/o toolchain is sufficient for the key as toolchain would be derived from it. http://reviews.llvm.org/D16250

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-15 Thread Artem Belevich via cfe-commits
tra added a reviewer: jlebar. tra updated this revision to Diff 45044. tra added a comment. Moved initializer checks from CodeGen to Sema. Added test cases for initializers of non-class variables. http://reviews.llvm.org/D15305 Files: include/clang/Basic/DiagnosticSemaKinds.td

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-15 Thread Artem Belevich via cfe-commits
tra marked an inline comment as done. tra added a comment. In http://reviews.llvm.org/D15305#327226, @rsmith wrote: > I think you missed this from my previous review: > > > This should be checked and diagnosed in Sema, not in CodeGen. > Done. http://reviews.llvm.org/D15305

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-15 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 45051. tra marked an inline comment as done. tra added a comment. Typo fix. http://reviews.llvm.org/D15305 Files: include/clang/Basic/DiagnosticSemaKinds.td include/clang/Sema/Sema.h lib/CodeGen/CGDeclCXX.cpp lib/CodeGen/CodeGenModule.cpp

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-15 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CodeGenModule.cpp:2334 @@ -2339,1 +2333,3 @@ + D->hasAttr()) Init = llvm::UndefValue::get(getTypes().ConvertType(ASTTy)); + else if (!InitExpr) { rsmith wrote: > As this is a global variable, it should

Re: [PATCH] D16080: [CUDA] Add tests for compiling CUDA files with -E.

2016-01-13 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: test/Driver/cuda-preprocess.cu:13-16 @@ +12,6 @@ + +// RUN: %clang -E -target x86_64-linux-gnu --cuda-gpu-arch=sm_20 %s 2>&1 \ +// RUN: | FileCheck -check-prefix NOARCH %s +// RUN: %clang -E -target x86_64-linux-gnu --cuda-gpu-arch=sm_20

Re: [PATCH] D16081: [CUDA] Add test for compiling CUDA code with -S.

2016-01-13 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D16081

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-12 Thread Artem Belevich via cfe-commits
tra updated this revision to Diff 44687. tra added a comment. Check all variable initializers and only allow 'empty constructors' as Richard has suggested. Changed test structure so that we test for allowed/disallowed constructors separately from testing how we handle initialization of base

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-12 Thread Artem Belevich via cfe-commits
tra added a comment. Richard, I've updated the patch as you've suggested -- it indeed simplifies things quite a bit and handles the corner cases you've mentioned. Comment at: lib/CodeGen/CGDeclCXX.cpp:323-324 @@ +322,4 @@ + + // The constructor function has no parameters, +

Re: [PATCH] D16082: [CUDA] Invoke ptxas and fatbinary during compilation.

2016-01-11 Thread Artem Belevich via cfe-commits
tra added a comment. Make sure it works with -save-temps and -fintegrated-as/-fno-integrated-as. They tend to throw wrenches into pipeline construction. Comment at: lib/Driver/Driver.cpp:1380 @@ +1379,3 @@ + C.MakeAction(DeviceActions, types::TY_CUDA_FATBIN), + /*

Re: [PATCH] D15911: Move ownership of Action objects into Compilation.

2016-01-08 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Action.h:36 @@ -35,1 +35,3 @@ +/// +/// Actions are usually owned by a Compilation. class Action { There's no API to pass ownership to Compilation explicitly, so the only way for an Action to be owned

Re: [PATCH] D15974: [CUDA] Split out tests for unused-arg warnings from cuda-options.cu.

2016-01-07 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. http://reviews.llvm.org/D15974

Re: [PATCH] D15936: Update code in buildCudaActions and BuildActions to latest idiom.

2016-01-06 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM Comment at: lib/Driver/Driver.cpp:1300 @@ -1299,5 +1299,3 @@ for (Arg *A : Args) { -if (A->getOption().matches(options::OPT_cuda_gpu_arch_EQ)) { - A->claim(); -

Re: [PATCH] D15596: Add -L/path/to/cuda/lib if any of our inputs are CUDA files.

2016-01-06 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/Tools.cpp:276 @@ +275,3 @@ + + // Add -L/path/to/cuda/lib if any of our inputs are .cu files. + if (std::any_of(C.getActions().begin(), C.getActions().end(), isCudaAction)) It just struck me that this patch may

Re: [PATCH] D15596: Add -L/path/to/cuda/lib if any of our inputs are CUDA files.

2016-01-06 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. Minor nit, but looks good otherwise. Comment at: include/clang/Driver/Options.td:1636 @@ -1635,1 +1635,3 @@ +def nocudalib : Flag<["-"], "nocudalib">, + HelpText<"Don't link with

Re: [PATCH] D15596: Add -L/path/to/cuda/lib if any of our inputs are CUDA files.

2016-01-06 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: include/clang/Driver/Options.td:1636 @@ -1635,1 +1635,3 @@ +def nocudalib : Flag<["-"], "nocudalib">, + HelpText<"Don't include libraries necessary for running CUDA (-L/path/to/cuda/lib{,64} -lcudart_static -lrt -lpthread -ldl)">; def

Re: [PATCH] D15933: Rename -nocudalib to -nocudalibdevice.

2016-01-06 Thread Artem Belevich via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM http://reviews.llvm.org/D15933

Re: [PATCH] D15686: PR25910: clang allows two var definitions with the same mangled name

2016-01-05 Thread Artem Belevich via cfe-commits
tra added a comment. A better description of the problem would help. PR itself is somewhat short on details. If I understand it correctly, the problem is that if we create multiple definitions with the same mangled name, clang does not always report it as an error and only emits one of those

Re: r256854 - [OpenMP] Allow file ID to be signed in the offloading metadata.

2016-01-05 Thread Artem Belevich via cfe-commits
Samuel, The tests are still failing: http://lab.llvm.org:8011/builders/clang-bpf-build/builds/5759 On Tue, Jan 5, 2016 at 10:02 AM, Samuel Antao via cfe-commits < cfe-commits@lists.llvm.org> wrote: > Author: sfantao > Date: Tue Jan 5 12:02:24 2016 > New Revision: 256854 > > URL:

Re: [PATCH] D15596: Add -L/path/to/cuda/lib if any of our inputs are CUDA files.

2016-01-05 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/Driver/ToolChains.cpp:4125 @@ +4124,3 @@ + ArgStringList ) const { + if (DriverArgs.hasArg(options::OPT_nocudalib) || !CudaInstallation.isValid()) +return; I'd rename -nocudalib to

Re: [PATCH] D15305: [CUDA] Do not allow dynamic initialization of global device side variables.

2016-01-05 Thread Artem Belevich via cfe-commits
tra added a comment. ping. http://reviews.llvm.org/D15305

Re: [PATCH] D15686: PR25910: clang allows two var definitions with the same mangled name

2016-01-04 Thread Artem Belevich via cfe-commits
tra added inline comments. Comment at: lib/CodeGen/CodeGenModule.cpp:1235-1236 @@ -1235,8 +1234,4 @@ // different type. -// FIXME: Support for variables is not implemented yet. -if (isa(D.getDecl())) - GV = cast(GetAddrOfGlobal(D, /*IsForDefinition=*/true)); -

r255911 - [CUDA] Make vtable construction aware of host/device side of CUDA compilation.

2015-12-17 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Dec 17 12:12:36 2015 New Revision: 255911 URL: http://llvm.org/viewvc/llvm-project?rev=255911&view=rev Log: [CUDA] Make vtable construction aware of host/device side of CUDA compilation. C++ emits vtables for classes that have key function present in the current TU. While we

Re: [PATCH] D15309: [CUDA] emit vtables only for classes with methods usable on this side of compilation.

2015-12-17 Thread Artem Belevich via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rL255911: [CUDA] Make vtable construction aware of host/device side of CUDA compilation. (authored by tra). Changed prior to commit: http://reviews.llvm.org/D15309?vs=42341&id=43150#toc Repository: rL

r255933 - [CUDA] runtime wrapper header tweaks

2015-12-17 Thread Artem Belevich via cfe-commits
Author: tra Date: Thu Dec 17 16:25:22 2015 New Revision: 255933 URL: http://llvm.org/viewvc/llvm-project?rev=255933&view=rev Log: [CUDA] runtime wrapper header tweaks * Pull in host-only implementations of a few CUDA-specific math functions. * #include early to prevent its inclusion from CUDA headers
