[PATCH] D112492: [CUDA][HIP] Allow comdat for kernels

Yaxun Liu via Phabricator via cfe-commits Tue, 09 Nov 2021 14:19:59 -0800

yaxunl added a comment.

I think it is probably necessary to merge the linkonce_odr kernel symbols for
them to work properly.


Consider the following testcase:

  // a.cu
  template<typename T>
  __global__ void foo(T x) {}
  
  void test1() {
      foo<<<1,1>>>(1);
  }
  
  // b.cu
  template<typename T>
  __global__ void foo(T x) {}
  
  void test2() {
      foo<<<1,1>>>(1);
  }
  
  // c.cu
  template<typename T>
  __global__ void foo(T x);
  
  int main() {
      foo<<<1,1>>>(1);
  }

Assume a.cu, b.cu, and c.cu are compiled with the default -fno-gpu-rdc option
and linked together.

Both a.obj and b.obj contain a global symbol foo<int>, the kernel stub
function. c.obj contains a reference to foo<int>, which has to resolve to
foo<int> in either a.obj or b.obj. The only sensible behavior is for the linker
to merge foo<int> from a.obj and b.obj and let c.obj resolve to the merged
symbol. This in turn requires the fat binaries embedded in a.obj and b.obj to
contain identical definitions of the kernel foo<int>. That is, if ODR is
followed, then even though two fat binaries contain kernel foo<int>, only one
of them will be used (which is fine, since they are identical), namely the one
corresponding to the merged symbol for the kernel stub foo<int>.

The implication is that we have to ask users to follow ODR even with the
default -fno-gpu-rdc option. Users cannot have different definitions of the
same template instantiation (e.g. foo<int>) in different TUs; otherwise the
behavior is undefined.
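
For illustration, here is a sketch of the kind of code that would break this
rule. The file names, the printf bodies, and main.cu are hypothetical and are
not taken from the patch or its tests; the include assumes CUDA (a HIP build
would use its own runtime header). Both TUs instantiate foo<int> with different
bodies, so once the stubs are merged it is not defined which device definition
a given launch actually runs.

```cuda
// a.cu (hypothetical): one definition of the template kernel
#include <cstdio>
#include <cuda_runtime.h>

template <typename T>
__global__ void foo(T x) { printf("a.cu body: %d\n", static_cast<int>(x)); }

void test1() {
    foo<<<1, 1>>>(1);            // instantiates foo<int> in this TU
    cudaDeviceSynchronize();     // wait so the device printf is flushed
}

// b.cu (hypothetical): a DIFFERENT body for the same template.
// Instantiating foo<int> here as well violates ODR.
#include <cstdio>
#include <cuda_runtime.h>

template <typename T>
__global__ void foo(T x) { printf("b.cu body: %d\n", static_cast<int>(x)); }

void test2() {
    foo<<<1, 1>>>(1);            // also instantiates foo<int>
    cudaDeviceSynchronize();
}

// main.cu (hypothetical): calls into both TUs
void test1();
void test2();

int main() {
    test1();
    test2();
    return 0;
}
```

If both files used the same body for foo, the program would be well-formed and
it would not matter which fat binary ends up serving foo<int>.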

Considering that ODR is a fundamental assumption of C++, I think it is
justifiable to require users to follow it regardless of whether -fgpu-rdc or
-fno-gpu-rdc is used.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112492/new/

https://reviews.llvm.org/D112492

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
