yaxunl created this revision.
yaxunl added reviewers: tra, rjmccall.

constexpr variables are compile-time constants and implicitly const, so they are safe to emit on both the device and host sides. In many cases they are also intended for use on both sides, so it makes sense to emit them on both when necessary.

In most cases constexpr variables are used as rvalues and the variables themselves do not need to be emitted. However, if their address is taken, they need to be emitted.

The following example shows that a constexpr variable is available on the device side without the `__device__` or `__constant__` attribute:

https://godbolt.org/z/Uf7CgK

This indicates that we need to emit constexpr variables on the device side when necessary, even without a `__device__` or `__constant__` attribute. This should be OK since the initializer is a compile-time constant and the variable itself is constant, so they can be emitted in the same way as on the host side.

For C++14, clang is able to handle this since it emits them with available_externally linkage together with the initializer. However, for C++17, constexpr static data members of a class or class template implicitly become inline variables. They therefore become definitions with linkonce_odr or weak_odr linkage and cannot have available_externally linkage.

This patch fixes that by allowing clang to emit definitions of constexpr variables in the standard way on the device side.

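To make the rvalue versus address-taken distinction concrete, here is a minimal host-only C++ sketch. It is not part of the patch: `a` and `Q` are borrowed from the test file, while `value_of_k` and `address_is_stable` are illustrative names. It assumes only standard C++ semantics: reading the value needs no emitted storage, whereas binding a reference is an odr-use and requires a definition.

```cpp
#include <cassert>

constexpr int a = 7;  // namespace scope: implicitly const, internal linkage

struct Q {
  // C++14: a declaration only; C++17: implicitly inline, hence a definition.
  static constexpr int k = 5;
};

// Needed in C++14 once Q::k is odr-used; in C++17 this is a redundant
// (deprecated but still valid) redeclaration.
constexpr int Q::k;

// Rvalue use: the value is read at compile time, so no storage is required.
constexpr int value_of_k() { return Q::k; }
static_assert(value_of_k() == 5, "usable in constant expressions");

// Binding a reference odr-uses the variable, so the variable must be emitted.
const int &use_a = a;
const int &use_Q_k = Q::k;

// The references alias the variables themselves, not temporaries.
bool address_is_stable() { return &use_a == &a && &use_Q_k == &Q::k; }
```

The reference bindings mirror the `__constant__ const int &use_... = ...;` lines in the test, which exist to force the constexpr variables to be emitted.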
https://reviews.llvm.org/D79237

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGenCUDA/constexpr-variables.cu

Index: clang/test/CodeGenCUDA/constexpr-variables.cu
===================================================================
--- /dev/null
+++ clang/test/CodeGenCUDA/constexpr-variables.cu
@@ -0,0 +1,30 @@
+// RUN: %clang_cc1 -std=c++14 %s -emit-llvm -o - -triple nvptx \
+// RUN:   -fcuda-is-device | FileCheck --check-prefixes=COM,CXX14 %s
+// RUN: %clang_cc1 -std=c++17 %s -emit-llvm -o - -triple nvptx \
+// RUN:   -fcuda-is-device | FileCheck --check-prefixes=COM,CXX17 %s
+
+#include "Inputs/cuda.h"
+
+// COM: @_ZL1a = internal {{.*}}constant i32 7
+constexpr int a = 7;
+__constant__ const int &use_a = a;
+
+namespace B {
+  // COM: @_ZN1BL1bE = internal {{.*}}constant i32 9
+  constexpr int b = 9;
+}
+__constant__ const int &use_B_b = B::b;
+
+struct Q {
+  // CXX14: @_ZN1Q1kE = available_externally {{.*}}constant i32 5
+  // CXX17: @_ZN1Q1kE = linkonce_odr {{.*}}constant i32 5
+  static constexpr int k = 5;
+};
+__constant__ const int &use_Q_k = Q::k;
+
+template<typename T> struct X {
+  // CXX14: @_ZN1XIiE1aE = available_externally {{.*}}constant i32 123
+  // CXX17: @_ZN1XIiE1aE = linkonce_odr {{.*}}constant i32 123
+  static constexpr int a = 123;
+};
+__constant__ const int &use_X_a = X<int>::a;
Index: clang/lib/CodeGen/CodeGenModule.cpp
===================================================================
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ clang/lib/CodeGen/CodeGenModule.cpp
@@ -2549,12 +2549,16 @@
   // If this is CUDA, be selective about which declarations we emit.
   if (LangOpts.CUDA) {
     if (LangOpts.CUDAIsDevice) {
+      bool IsConstexprVar = false;
+      if (auto *VD = dyn_cast<VarDecl>(Global))
+        IsConstexprVar = VD->isConstexpr();
       if (!Global->hasAttr<CUDADeviceAttr>() &&
           !Global->hasAttr<CUDAGlobalAttr>() &&
           !Global->hasAttr<CUDAConstantAttr>() &&
           !Global->hasAttr<CUDASharedAttr>() &&
           !Global->getType()->isCUDADeviceBuiltinSurfaceType() &&
-          !Global->getType()->isCUDADeviceBuiltinTextureType())
+          !Global->getType()->isCUDADeviceBuiltinTextureType() &&
+          !IsConstexprVar)
         return;
     } else {
       // We need to emit host-side 'shadows' for all global
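
The effect of the CodeGenModule.cpp hunk can be modelled outside of clang as a plain predicate. The sketch below is a hypothetical simplification, not clang code: `GlobalInfo` and `skipOnDevice` are invented names, and the boolean fields stand in for the `hasAttr<...>()`, builtin surface/texture type, and `isConstexpr()` queries in the patched condition.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties the patch inspects on a global
// declaration; none of these names exist in clang.
struct GlobalInfo {
  bool hasDeviceAttr = false;           // __device__
  bool hasGlobalAttr = false;           // __global__
  bool hasConstantAttr = false;         // __constant__
  bool hasSharedAttr = false;           // __shared__
  bool isSurfaceOrTextureType = false;  // builtin surface or texture type
  bool isConstexprVar = false;          // new condition added by the patch
};

// Returns true when device-side compilation would take the early `return`
// in the patched condition, i.e. none of the exemptions apply.
bool skipOnDevice(const GlobalInfo &G) {
  return !G.hasDeviceAttr && !G.hasGlobalAttr && !G.hasConstantAttr &&
         !G.hasSharedAttr && !G.isSurfaceOrTextureType && !G.isConstexprVar;
}
```

In this model, an unattributed `constexpr int a = 7;` has only `isConstexprVar` set and is no longer skipped, whereas an unattributed non-constexpr global still is.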
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits