[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-12 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

Ah, too bad.  Is there any way to suppress that in debug info?  I'm not sure 
there's any other way to satisfy the competing requirements here, and if it's 
not going to be consistent, it'd be better to avoid the complexity of mangling 
the thunk differently.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-12 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77743#1975822 , @rjmccall wrote:

> Is the renaming just being done to avoid breakpoints from triggering in the 
> stub?  Can you not disable debugging the stub using whatever mechanism 
> `__attribute__((nodebug))` uses?


I tried it. The source info and line number is gone, but gdb will still break 
on the function since symbol is still there.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-11 Thread John McCall via Phabricator via cfe-commits
rjmccall added a comment.

Is the renaming just being done to avoid breakpoints from triggering in the 
stub?  Can you not disable debugging the stub using whatever mechanism 
`__attribute__((nodebug))` uses?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

The ambiguity issue is still there. That `__global__` function generates 
different code if it's compiled as HIP by clang or non-HIP code by clang or 
other compilers. That will break the resolving from the symbol value to its 
device kernel name.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77743#1972301 , @hliao wrote:

> In D77743#1972298 , @yaxunl wrote:
>
> > In D77743#1972292 , @hliao wrote:
> >
> > > In addition, we may also need to extend the registration to set up the 
> > > mapping from that global variable to the host side stub function. 
> > > `hipKernelLaunch` (implemented as a function call instead of the kernel 
> > > launch syntax) to call into that stub function to prepare the arguments.
> >
> >
> > hipKernelLaunch does not call the stub function. The stub function calls 
> > hipKernelLaunch. Therefore user/runtime does not need to know about stub 
> > function to launch a kernel.
>
>
> Since the code using hipKernelLuanch may be compiled by other compilers, we 
> cannot force reinterpreting the use of that symbol by loading value from the 
> symbol. For code like this
>
>   __global__ void foo();
>  
>   hipKernelLaunch(foo, ...)
>
>
> If it's compiled by other compiler, `foo` refers to the value of that symbol, 
> i.e. a constant, instead of the value loading from that symbol. They are 
> different.


Right. This will work. We don't need user to load from foo, because foo will 
resolve to kernel handle instead of stub function.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

In D77743#1972298 , @yaxunl wrote:

> In D77743#1972292 , @hliao wrote:
>
> > In addition, we may also need to extend the registration to set up the 
> > mapping from that global variable to the host side stub function. 
> > `hipKernelLaunch` (implemented as a function call instead of the kernel 
> > launch syntax) to call into that stub function to prepare the arguments.
>
>
> hipKernelLaunch does not call the stub function. The stub function calls 
> hipKernelLaunch. Therefore user/runtime does not need to know about stub 
> function to launch a kernel.


Since the code using hipKernelLuanch may be compiled by other compilers, we 
cannot force reinterpreting the use of that symbol by loading value from the 
symbol. For code like this

  __global__ void foo();
  
  hipKernelLaunch(foo, ...)

If it's compiled by other compiler, `foo` refers to the value of that symbol, 
i.e. a constant, instead of the value loading from that symbol. They are 
different.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77743#1972292 , @hliao wrote:

> In addition, we may also need to extend the registration to set up the 
> mapping from that global variable to the host side stub function. 
> `hipKernelLaunch` (implemented as a function call instead of the kernel 
> launch syntax) to call into that stub function to prepare the arguments.


hipKernelLaunch does not call the stub function. The stub function calls 
hipKernelLaunch. Therefore user/runtime does not need to know about stub 
function to launch a kernel.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Michael Liao via Phabricator via cfe-commits
hliao added a comment.

In addition, we may also need to extend the registration to set up the mapping 
from that global variable to the host side stub function. `hipKernelLaunch` 
(implemented as a function call instead of the kernel launch syntax) to call 
into that stub function to prepare the arguments.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77743#1972258 , @hliao wrote:

> In D77743#1970304 , @tra wrote:
>
> > In D77743#1970163 , @yaxunl wrote:
> >
> > > The kernel handle is a variable. Even if it has the same name as kernel, 
> > > it is OK for the debugger since the debugger does not put break point on 
> > > a variable.
> >
> >
> > The patch appears to apply only to generated kernels. What happens when we 
> > take address of the kernel directly?
> >
> >   a.hip: 
> >   __global__ void kernel() {}
> >  
> >   auto kernel_ref() {
> > return kernel;
> >   }
> >  
> >   b.hip:
> >   extern __global__ void kernel(); // access the handle var
> >   something kernel_ref(); // returns the stub pointer?
> >  
> >   void f() {
> > auto x = kernel_ref();
> > auto y = kernel(); 
> > hipLaunchKernel(x,...); // x is the stub pointer. 
> > hipLaunchKernel(y,...);
> >   }
> >
> >
> > Will `x` and `y` contain the same value? For CUDA the answer would be yes 
> > as they both would contain the address of the host-side stub with the 
> > kernel's name.
> >  In this case external reference will point to the handle variable, but I'm 
> > not sure what would kernel_ref() return. 
> >  My guess is that it will be the stub address, which may be a problem.  I 
> > may be wrong. It would be good to add a test to verify that we always get 
> > consistent results when we're referencing the kernel.
>
>
> That's a good question. That introduces the ambiguity on the values of the 
> same symbol (from the programmer point of view). To ensure we won't have 
> ambiguity, we should always use that *alias* global variable for `__global__` 
> function on the host side as it will be used in the runtime API to query the 
> device-side function.


I think I need to initialize the kernel handle with the address of the stub 
function. Any reference to the kernel in host code will use the kernel handle 
instead of stub function. When the stub function is called, if it is known at 
compile time, it will be called directly. If it is indirectly called, I will 
load the stub function from the kernel handle and call it.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-09 Thread Michael Liao via Phabricator via cfe-commits
hliao requested changes to this revision.
hliao added a comment.
This revision now requires changes to proceed.

In D77743#1970304 , @tra wrote:

> In D77743#1970163 , @yaxunl wrote:
>
> > The kernel handle is a variable. Even if it has the same name as kernel, it 
> > is OK for the debugger since the debugger does not put break point on a 
> > variable.
>
>
> The patch appears to apply only to generated kernels. What happens when we 
> take address of the kernel directly?
>
>   a.hip: 
>   __global__ void kernel() {}
>  
>   auto kernel_ref() {
> return kernel;
>   }
>  
>   b.hip:
>   extern __global__ void kernel(); // access the handle var
>   something kernel_ref(); // returns the stub pointer?
>  
>   void f() {
> auto x = kernel_ref();
> auto y = kernel(); 
> hipLaunchKernel(x,...); // x is the stub pointer. 
> hipLaunchKernel(y,...);
>   }
>
>
> Will `x` and `y` contain the same value? For CUDA the answer would be yes as 
> they both would contain the address of the host-side stub with the kernel's 
> name.
>  In this case external reference will point to the handle variable, but I'm 
> not sure what would kernel_ref() return. 
>  My guess is that it will be the stub address, which may be a problem.  I may 
> be wrong. It would be good to add a test to verify that we always get 
> consistent results when we're referencing the kernel.


That's a good question. That introduces the ambiguity on the values of the same 
symbol (from the programmer point of view). To ensure we won't there's no 
ambiguity, we should always use that *alias* global variable for `__global__` 
function on the host side as it will be used in the runtime API to query the 
device-side function.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

In D77743#1970163 , @yaxunl wrote:

> The kernel handle is a variable. Even if it has the same name as kernel, it 
> is OK for the debugger since the debugger does not put break point on a 
> variable.


The patch appears to apply only to generated kernels. What happens when we take 
address of the kernel directly?

  a.hip: 
  __global__ void kernel() {}
  
  auto kernel_ref() {
return kernel;
  }
  
  b.hip:
  extern __global__ void kernel(); // access the handle var
  something kernel_ref(); // returns the stub pointer?
  
  void f() {
auto x = kernel_ref();
auto y = kernel(); 
hipLaunchKernel(x,...); // x is the stub pointer. 
hipLaunchKernel(y,...);
  }

Will `x` and `y` contain the same value? For CUDA the answer would be yes as 
they both would contain the address of the host-side stub with the kernel's 
name.
In this case external reference will point to the handle variable, but I'm not 
sure what would kernel_ref() return. 
My guess is that it will be the stub address, which may be a problem.  I may be 
wrong. It would be good to add a test to verify that we always get consistent 
results when we're referencing the kernel.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D77743#1970035 , @tra wrote:

> Would not this scheme create a conflict between the device-side mangled 
> kernel name and the handle which we emit with the same name? I recall that 
> the distinct stub name was introduced specifically to avoid confusion between 
> device-side kernel and the host-side stub that were visible at the same time 
> (to debugger only?). Now we seen to re-introduce the same name only for the 
> host-side handle instead of the host-side stub.


we need the stub name to be different than the kernel name because otherwise 
the debugger will break on the stub function when the users put a break point 
on the kernel.

The kernel handle is a variable. Even if it has the same name as kernel, it is 
OK for the debugger since the debugger does not put break point on a variable.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

Would not this scheme create a conflict between the device-side mangled kernel 
name and the handle which we emit with the same name? I recall that the 
distinct stub name was introduced specifically to avoid confusion between 
device-side kernel and the host-side stub that were visible at the same time 
(to debugger only?). Now we seen to re-introduce the same name only for the 
host-side handle instead of the host-side stub.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77743/new/

https://reviews.llvm.org/D77743



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77743: [HIP] Emit symbols with kernel name in host binary

2020-04-08 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision.
yaxunl added reviewers: tra, rjmccall.

HIP provide host API to allow C/C++ programs to
launch kernel. A C/C++ program can declare a HIP
kernel as an external function and pass it to
the kernel launching API. When linked with object
files built from HIP programs. These external functions
will resolve to symbols with the same name in HIP
programs so that kernels with the same name can be
found and launched.

This requires clang to emit symbols with the same
name as kernels in object files and use them to
identify kernels, instead of using device stub
functions to identify kernels, since device stub
function has different names than kernels.

This patch lets clang emits a void* type global
variable for each kernel in host IR, which is
called kernel handle. The kernel handle has the
same mangled name as kernel by host ABI. It is
passed to __hipRegisterFunction and kernel launching
functions for identifying kernels.


https://reviews.llvm.org/D77743

Files:
  clang/lib/CodeGen/CGCUDANV.cpp
  clang/test/CodeGenCUDA/Inputs/cuda.h
  clang/test/CodeGenCUDA/cxx-call-kernel.cpp
  clang/test/CodeGenCUDA/kernel-stub-name.cu
  clang/test/CodeGenCUDA/unnamed-types.cu

Index: clang/test/CodeGenCUDA/unnamed-types.cu
===
--- clang/test/CodeGenCUDA/unnamed-types.cu
+++ clang/test/CodeGenCUDA/unnamed-types.cu
@@ -36,4 +36,4 @@
   }(p);
 }
 // HOST: @__hip_register_globals
-// HOST: __hipRegisterFunction{{.*}}@_Z17__device_stub__k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_{{.*}}@0
+// HOST: __hipRegisterFunction{{.*}}@_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_{{.*}}@0
Index: clang/test/CodeGenCUDA/kernel-stub-name.cu
===
--- clang/test/CodeGenCUDA/kernel-stub-name.cu
+++ clang/test/CodeGenCUDA/kernel-stub-name.cu
@@ -6,6 +6,12 @@
 
 #include "Inputs/cuda.h"
 
+// Kernel handles
+
+// CHECK: @[[HCKERN:ckernel]] = constant i8* null
+// CHECK: @[[HNSKERN:_ZN2ns8nskernelEv]] = constant i8* null
+// CHECK: @[[HTKERN:_Z10kernelfuncIiEvv]] = linkonce_odr constant i8* null
+
 extern "C" __global__ void ckernel() {}
 
 namespace ns {
@@ -26,9 +32,9 @@
 // Non-template kernel stub functions
 
 // CHECK: define{{.*}}@[[CSTUB:__device_stub__ckernel]]
-// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[CSTUB]]
+// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[HCKERN]]
 // CHECK: define{{.*}}@[[NSSTUB:_ZN2ns23__device_stub__nskernelEv]]
-// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[NSSTUB]]
+// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[HNSKERN]]
 
 // CHECK-LABEL: define{{.*}}@_Z8hostfuncv()
 // CHECK: call void @[[CSTUB]]()
@@ -45,11 +51,11 @@
 // Template kernel stub functions
 
 // CHECK: define{{.*}}@[[TSTUB]]
-// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[TSTUB]]
+// CHECK: call{{.*}}@hipLaunchByPtr{{.*}}@[[HTKERN]]
 
 // CHECK: declare{{.*}}@[[DSTUB]]
 
 // CHECK-LABEL: define{{.*}}@__hip_register_globals
-// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[CSTUB]]{{.*}}@[[CKERN]]
-// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[NSSTUB]]{{.*}}@[[NSKERN]]
-// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[TSTUB]]{{.*}}@[[TKERN]]
+// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[HCKERN]]{{.*}}@[[CKERN]]
+// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[HNSKERN]]{{.*}}@[[NSKERN]]
+// CHECK: call{{.*}}@__hipRegisterFunction{{.*}}@[[HTKERN]]{{.*}}@[[TKERN]]
Index: clang/test/CodeGenCUDA/cxx-call-kernel.cpp
===
--- /dev/null
+++ clang/test/CodeGenCUDA/cxx-call-kernel.cpp
@@ -0,0 +1,19 @@
+// RUN: %clang_cc1 -x hip -emit-llvm-bc %s -o %t.hip.bc
+// RUN: %clang_cc1 -mlink-builtin-bitcode %t.hip.bc -DHIP_PLATFORM -emit-llvm \
+// RUN:   %s -o - | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+// CHECK: @_Z2g1i = internal constant i8* null
+#if __HIP__
+__global__ void g1(int x) {}
+#else
+extern void g1(int x);
+
+// CHECK: call i32 @hipLaunchKernel{{.*}}@_Z2g1i
+void test() {
+  hipLaunchKernel((void*)g1, 1, 1, nullptr, 0, 0);
+}
+
+// CHECK: __hipRegisterFunction{{.*}}@_Z2g1i
+#endif
Index: clang/test/CodeGenCUDA/Inputs/cuda.h
===
--- clang/test/CodeGenCUDA/Inputs/cuda.h
+++ clang/test/CodeGenCUDA/Inputs/cuda.h
@@ -2,19 +2,28 @@
 
 #include 
 
+#if __HIP__ || __CUDA__
 #define __constant__ __attribute__((constant))
 #define __device__ __attribute__((device))
 #define __global__ __attribute__((global))
 #define __host__ __attribute__((host))
 #define __shared__ __attribute__((shared))
 #define __launch_bounds__(...) __attribute__((launch_bounds(__VA_ARGS__)))
+#else
+#define __constant__
+#define __device__
+#define __global__
+#define __host__
+#define __shared__
+#define __launch_bounds__(...)
+#endif
 
 struct dim3 {
   unsigned x, y, z;
   __host__ __device__ dim3(unsigned x, unsigned y = 1, unsigned z = 1) : x(x), y(y), z(z) {}
 };
 
-#ifdef __HIP__
+#if __HIP__ ||