================
Comment at: lib/Headers/cuda/cuda_builtin_vars.h:89
@@ +88,3 @@
+// architectures have a WARP_SZ value of 32'.
+__CUDA_BUILTIN_VAR int warpSize = 32;
+
----------------
rsmith wrote:
> tra wrote:
> > rsmith wrote:
> > > tra wrote:
> > > > rsmith wrote:
> > > > > This is a (strong) definition due to the `extern` in the 
> > > > > `__CUDA_BUILTIN_VAR` macro, and will cause a link error if this file 
> > > > > is `#include`d into multiple objects in the same binary. You could 
> > > > > solve this by removing the `extern` (that'd give the variable 
> > > > > internal linkage) or by using `enum { warpSize = 32 };` or similar. 
> > > > > I'm not sure exactly what the CUDA spec requires here.
> > > > Those have to be extern because no instances of those special variables 
> > > > can ever exist, yet they are referred to as if they do. 'extern' 
> > > > matches this functionality best. I've added a weak attribute to make 
> > > > linking work. That is, if/when we'll get to do any linking on the GPU 
> > > > code. Currently we produce PTX assembly and we don't ever link one PTX 
> > > > file with another.
> > > > 
> > > > I've reworked warpSize into a class with the same restrictions on 
> > > > construction and access as the other built-in variables.
> > > > 
> > > > As for CUDA spec, it does not say much -- 'They [built-in variables] 
> > > > are only valid within functions that are executed on the device. 
> > > > [warpSize] is of type int and contains the warp size in threads.'
> > > > 
> > > > 
> > > I was only talking about `warpSize`, not the others. The others seem OK 
> > > as (non-weak) `extern` declarations, on the assumption that programs 
> > > aren't actually allowed to use the address of these.
> > > 
> > > Your new approach doesn't satisfy the requirement that "`warpSize` is of 
> > > type `int`": this is detectable through template argument deduction, for 
> > > instance:
> > > 
> > >   template<typename T> T max(T, T)
> > >   max(warpSize, 10); // error, T is ambiguous, could be int or 
> > > __cuda_builtin_warpSize_t
> > > 
> > > Using
> > > 
> > >   __attribute__((device)) const int warpSize = 32;
> > > 
> > > would probably work (that is, drop the `extern` from what you had before).
> > Another downside of warpSize-in-a-class was that a reference to an instance 
> > of that class was required for the type conversion operator as it can't be 
> > a static member function. Scratch that.
> > 
> > Dropping extern is not sufficient. 'const int' suffers from the same 
> > problem as 'extern' -- we end up with warpSize visible externally which 
> > causes linking error.
> > 
> > I could use 'static const int' -- it keeps the symbol local and does not 
> > create linking issues. Unfortunately it allows one to take address of the 
> > variable. It's not perfect, but we can live with that for now.
> > 
> > 
> > 
> > 
> In C++, `const` on a global variable definition (that is not explicitly 
> marked `extern`) implies internal linkage just like `static` does -- the 
> `static` is harmless but redundant.
Thanks. I was unaware of that. Will fix.

http://reviews.llvm.org/D9064

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Reply via email to