It is indeed a deep copy. It has to be, since 32-bit floating point numbers use a different representation than 64-bit floating point numbers, even for the exact values which they have in common.
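
To make the "different representation" point concrete, here is a small C sketch. It assumes IEEE 754 floats and doubles (4 and 8 bytes), which is typical but not guaranteed by C, and the helper names float_bits, double_bits and first_bytes_as_float are made up for illustration. It uses memcpy to look at the raw bytes, since dereferencing a float pointer that really points at a double is not something the C standard lets you rely on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 4-byte IEEE 754 binary32). */
uint32_t float_bits(float f) {
    uint32_t u;
    assert(sizeof f == sizeof u);   /* the sketch assumes matching sizes */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Raw bit pattern of a double (assumes an 8-byte IEEE 754 binary64). */
uint64_t double_bits(double d) {
    uint64_t u;
    assert(sizeof d == sizeof u);   /* the sketch assumes matching sizes */
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Reinterpret the first sizeof(float) bytes of a double's storage as a
 * float. This is what "same memory, different type" would amount to;
 * it performs no numeric conversion at all. */
float first_bytes_as_float(double d) {
    float f;
    memcpy(&f, &d, sizeof f);
    return f;
}
```

Under those assumptions, float_bits((float)1.5) is 0x3FC00000 while double_bits(1.5) is 0x3FF8000000000000, so the same value has two unrelated bit patterns. On a little-endian machine first_bytes_as_float(1.5) comes out as 0.0, not 1.5, which is why a real conversion has to produce new data and cannot just share the original's memory.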

Meanwhile, a C expression such as float *a = (float *) &b; means that the address which is used to store b becomes the value of a. Roughly speaking: memory is an array and a becomes the index of b in that array.

In terms of machine instructions, this can be a bit tricky, because b might be stored in a register, and because the compiler does not, in general, know what other code does (the linker will have that information, but then there's shared libraries and other such things which postpone linking for quite some time, ... anyways...). So basically, any time the compiler has to deal with code which it doesn't know about, it needs to store b in memory (if its value was in a register) so that any references to b's address will also have access to b's value.

Needless to say, this grates on the nerves of people who write C compilers (among others). But there are always tradeoffs, and bigger problems to deal with...

That said, I've never gotten going with CUDA so I can't offer anything useful there. I'd be interested in hearing about your discoveries.

Thanks,

-- 
Raul

On Sat, Aug 30, 2014 at 2:01 AM, Scott Locklin <[email protected]> wrote:
> Raul wrote:
>>This isn't really worth a pull request, but double->float is 1&(3!:5)
>>and float->double is _1&(3!:5)
>
>>This is documented at
>>http://www.jsoftware.com/help/dictionary/dx003.htm. (Note that floats
>>are represented as a sequence of literals, because J can't work with
>>them.)
>
> Thanks Raul: I've used 3!:5/fc at some point in my brief J career to
> accomplish this (I think in the flann hooks).
>
> For my education: when I do a=. 1&(3!:5)b, this does a deep copy, right?
> Aka, there isn't some weird bit masking thing, with a and b referring to the
> same memory location, is there? I confess, I never quite understood what the
> C compiler does when you do something like a = (float *) &b; -I always
> figured it was passing to a by reference somehow.
>
> Speaking of deep copy, the only tricky part of this is going to be pushing
> the data to the GPU and pulling it back out again. If I can return some kind
> of useful pointers to the GPU data, the way it is done in Torch7, it should
> work pretty well.
>
> I'm quite pleased with the blas-mp speedup; it's already useful.
>
> -SL
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
