================
Comment at: lib/Headers/cuda/cuda_builtin_vars.h:89
@@ +88,3 @@
+// architectures have a WARP_SZ value of 32'.
+__CUDA_BUILTIN_VAR int warpSize = 32;
+
----------------
rsmith wrote:
> tra wrote:
> > rsmith wrote:
> > > tra wrote:
> > > > rsmith wrote:
> > > > > This is a (strong) definition due to the `extern` in the
> > > > > `__CUDA_BUILTIN_VAR` macro, and will cause a link error if this file
> > > > > is `#include`d into multiple objects in the same binary. You could
> > > > > solve this by removing the `extern` (that'd give the variable
> > > > > internal linkage) or by using `enum { warpSize = 32 };` or similar.
> > > > > I'm not sure exactly what the CUDA spec requires here.
> > > > Those have to be extern because no instances of those special variables
> > > > can ever exist, yet they are referred to as if they do. 'extern'
> > > > matches this functionality best. I've added a weak attribute to make
> > > > linking work. That is, if/when we'll get to do any linking on the GPU
> > > > code. Currently we produce PTX assembly and we don't ever link one PTX
> > > > file with another.
> > > >
> > > > I've reworked warpSize into a class with the same restrictions on
> > > > construction and access as the other built-in variables.
> > > >
> > > > As for CUDA spec, it does not say much -- 'They [built-in variables]
> > > > are only valid within functions that are executed on the device.
> > > > [warpSize] is of type int and contains the warp size in threads.'
> > > >
> > > >
> > > I was only talking about `warpSize`, not the others. The others seem OK
> > > as (non-weak) `extern` declarations, on the assumption that programs
> > > aren't actually allowed to use the address of these.
> > >
> > > Your new approach doesn't satisfy the requirement that "`warpSize` is of
> > > type `int`": this is detectable through template argument deduction, for
> > > instance:
> > >
> > > template<typename T> T max(T, T)
> > > max(warpSize, 10); // error, T is ambiguous, could be int or
> > > __cuda_builtin_warpSize_t
> > >
> > > Using
> > >
> > > __attribute__((device)) const int warpSize = 32;
> > >
> > > would probably work (that is, drop the `extern` from what you had before).
> > Another downside of warpSize-in-a-class was that a reference to an instance
> > of that class was required for the type conversion operator as it can't be
> > a static member function. Scratch that.
> >
> > Dropping extern is not sufficient. 'const int' suffers from the same
> > problem as 'extern' -- we end up with warpSize visible externally which
> > causes linking error.
> >
> > I could use 'static const int' -- it keeps the symbol local and does not
> > create linking issues. Unfortunately it allows one to take address of the
> > variable. It's not perfect, but we can live with that for now.
> >
> >
> >
> >
> In C++, `const` on a global variable definition (that is not explicitly
> marked `extern`) implies internal linkage just like `static` does -- the
> `static` is harmless but redundant.
Thanks. I was unaware of that. Will fix.
http://reviews.llvm.org/D9064
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits