Re: [PATCH] End-to-end CUDA compilation.

Artem Belevich Tue, 05 May 2015 15:06:12 -0700

================
Comment at: lib/CodeGen/CGCUDANV.cpp:164
@@ +163,3 @@
+/// Creates internal function to register all kernel stubs generated in this
+/// module with CUDA runtime.
+/// \code
----------------
echristo wrote:
> "with the CUDA runtime".
Done.


================
Comment at: lib/CodeGen/CGCUDANV.cpp:166
@@ +165,3 @@
+/// \code
+/// void .cuda_register_kernels(void** GpuBinaryHandle) {
+///    __cudaRegisterFunction(GpuBinaryHandle,Kernel0,...);
----------------
echristo wrote:
> The function name begins with a .? Ugh.
Replaced with __

================
Comment at: lib/CodeGen/CGCUDANV.cpp:197-207
@@ +196,13 @@
+    llvm::Constant *NullPtr = llvm::ConstantPointerNull::get(VoidPtrTy);
+    llvm::Value *args[] = {
+        &GpuBinaryHandlePtr,
+        Builder.CreateBitCast(Kernel, VoidPtrTy),
+        KernelName,
+        KernelName,
+        llvm::ConstantInt::get(IntTy, -1),
+        NullPtr,
+        NullPtr,
+        NullPtr,
+        NullPtr,
+        llvm::ConstantPointerNull::get(IntTy->getPointerTo())};
+    Builder.CreateCall(RegisterFunc, args);
----------------
echristo wrote:
> clang-format?
Done. I've also replaced last argument with a plain NullPtr.

================
Comment at: lib/Driver/Driver.cpp:183
@@ -182,3 +182,3 @@
 
-    // -c only runs up to the assembler.
-  } else if ((PhaseArg = DAL.getLastArg(options::OPT_c))) {
+    // -c and partial CUDA compilations only runs up to the assembler.
+  } else if ((PhaseArg = DAL.getLastArg(options::OPT_c)) ||
----------------
echristo wrote:
> "and partial CUDA compilations only run up"
Fixed.

================
Comment at: lib/Driver/Driver.cpp:1194
@@ +1193,3 @@
+  if (GpuArchList.empty())
+    GpuArchList.push_back("sm_20");
+
----------------
echristo wrote:
> Some comment on the default here.
Done.

================
Comment at: lib/Driver/Driver.cpp:1672-1674
@@ -1551,1 +1671,5 @@
 
+static llvm::Triple computeTargetTriple(StringRef DefaultTargetTriple,
+                                        const ArgList &Args,
+                                        StringRef DarwinArchName);
+
----------------
echristo wrote:
> Do you need the declaration up here? Why not just pull the static function up 
> if so?
That would clutter the changes for no good reason. Whenever bunch of code moved 
from one place to another, it's always a pain figuring out whether things were 
just copied or copied *and* changed. Forward declaration is a lesser crime, IMO.

================
Comment at: lib/Driver/Driver.cpp:1732
@@ +1731,3 @@
+        computeTargetTriple(DefaultTargetTriple, C.getArgs(), "");
+    llvm::Triple TargetTriple(HostTriple.isArch64Bit() ? "nvptx64-nvidia-cuda"
+                                                       : "nvptx-nvidia-cuda");
----------------
echristo wrote:
> Probably would prefer "DeviceTriple" here.
Done.

================
Comment at: lib/Driver/Tools.cpp:2583-2588
@@ -2577,1 +2582,8 @@
+  assert(Inputs.size() >= 1 && "Must have at least one input.");
+  InputInfoList BaseInputs; // Inputs[0]
+  const InputInfo &Input = Inputs[0];
+  BaseInputs.push_back(Input);
+  bool IsCuda = types::isCuda(Input.getType());
+
+  assert((IsCuda || Inputs.size() == 1) && "Unable to handle multiple 
inputs.");
 
----------------
echristo wrote:
> Comment about what's going on here.
Done.

================
Comment at: lib/Driver/Tools.cpp:2696
@@ -2683,3 +2695,3 @@
   CmdArgs.push_back("-main-file-name");
-  CmdArgs.push_back(getBaseInputName(Args, Inputs));
+  CmdArgs.push_back(getBaseInputName(Args, Input));
 
----------------
echristo wrote:
> Might be nice to pull this sort of change out so it isn't affecting the rest 
> of the diff.
Sure. I was also thinking of splitting code generation into a separate commit 
as well as it's largely independent of the driver changes.

================
Comment at: test/CodeGenCUDA/device-stub.cu:40-46
@@ +39,9 @@
+// Test that we've built contructor..
+// CHECK: define internal void @.cuda_module_ctor
+//   .. that calls __cudaRegisterFatBinary(&.cuda_fatbin_wrapper)
+// CHECK: call{{.*}}cudaRegisterFatBinary{{.*}}.cuda_fatbin_wrapper
+//   .. stores return value in .cuda_gpubin_handle
+// CHECK: store{{.*}}.cuda_gpubin_handle
+//   .. and then calls .cuda_register_kernels
+// CHECK: call void @.cuda_register_kernels
+
----------------
echristo wrote:
> Should some of these be CHECK-NEXT?
Some. Changed to CHECK-NEXT where it was possible.

http://reviews.llvm.org/D8463

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] End-to-end CUDA compilation.

Reply via email to