It is indeed a deep copy. It has to be, since 32-bit floating point
numbers use a different representation than 64-bit floating point
numbers, even for the exact values which they have in common.
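
To make the representation difference concrete, here is a minimal C
sketch. It assumes IEEE 754 binary32/binary64 floats, which is what
practically every current platform uses; the helper names float_bits
and double_bits are made up for illustration and are not part of J or
any library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a 32-bit float (assumes IEEE 754 binary32). */
uint32_t float_bits(float f) {
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);   /* copy the bytes, no numeric conversion */
    return u;
}

/* Raw bit pattern of a 64-bit double (assumes IEEE 754 binary64). */
uint64_t double_bits(double d) {
    uint64_t u;
    assert(sizeof u == sizeof d);
    memcpy(&u, &d, sizeof u);   /* copy the bytes, no numeric conversion */
    return u;
}
```

Under that assumption, float_bits(1.0f) is 0x3F800000 while
double_bits(1.0) is 0x3FF0000000000000: the same value, but different
widths and different bits, so a conversion has to build new data.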

Meanwhile, a C expression such as
   float *a= (float *) &b;
means that the address which is used to store b becomes the value of
a.  Roughly speaking: memory is an array and a becomes the index of b
in that array.
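
A small sketch of the difference between the pointer cast and an actual
value conversion (the function names are hypothetical, chosen only for
this illustration). Note that the sketch never reads through a: in
standard C, dereferencing a float pointer that points at a double
violates the aliasing rules and is undefined behavior.

```c
#include <assert.h>

/* Returns 1 if the cast pointer holds the same address as &b. */
int cast_keeps_address(void) {
    double b = 1.0;
    float *a = (float *) &b;   /* only the pointer type changes; b is untouched */
    return (void *) a == (void *) &b;
}

/* Returns 1 if a converted value is independent of the original. */
int conversion_is_a_copy(void) {
    double b = 1.0;
    float c = (float) b;       /* value conversion: c is a separate object */
    b = 2.0;                   /* changing b afterwards does not affect c */
    return c == 1.0f;
}
```

Both functions return 1: the cast gives a the address of b and nothing
else, while the conversion produces a new float with its own storage.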

In terms of machine instructions, this can be a bit tricky, because b
might be stored in a register, and because the compiler does not, in
general, know what other code does (the linker will have that
information, but then there's shared libraries and other such things
which postpone linking for quite some time, ... anyways...).

So basically, any time the compiler has to deal with code which it
doesn't know about, it needs to store b in memory (if its value was in
a register) so that any references to b's address will also have
access to b's value. Needless to say, this grates on the nerves of
people who write C compilers (among others). But there are always
tradeoffs, and bigger problems to deal with...
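
As a rough sketch of that situation, the function pointer below stands
in for code the compiler cannot see. The names opaque_fn, set_to_two
and read_after_call are made up for this example.

```c
#include <assert.h>

/* Hypothetical callback type standing in for code unknown to the compiler. */
typedef void (*opaque_fn)(double *);

/* One possible callee: overwrites the double it is handed. */
void set_to_two(double *p) {
    *p = 2.0;
}

double read_after_call(opaque_fn f) {
    double b = 1.0;   /* up to here b could live purely in a register */
    f(&b);            /* the address escapes, so b needs a memory location */
    return b;         /* b has to be read again: f may have changed it */
}
```

Calling read_after_call(set_to_two) returns 2.0, not the 1.0 that was
originally assigned, which is why the compiler cannot keep assuming a
register copy of b is current across the call.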

That said, I've never gotten going with CUDA so I can't offer anything
useful there.

I'd be interested in hearing about your discoveries.

Thanks,

-- 
Raul


On Sat, Aug 30, 2014 at 2:01 AM, Scott Locklin <[email protected]> wrote:
> Raul wrote:
>>This isn't really worth a pull request, but double->float is 1&(3!:5)
>>and float->double is _1&(3!:5)
>
>>This is documented at
>>http://www.jsoftware.com/help/dictionary/dx003.htm (Note that floats
>>are represented as a sequence of literals, because J can't work with
>>them.)
>
> Thanks Raul: I've used 3!:5/fc at some point in my brief J career to
> accomplish this (I think in the flann hooks).
>
> For my education: when I do a=. 1&(3!:5)b, this does a deep copy, right?
> That is, there isn't some weird bit masking thing, with a and b referring
> to the same memory location, is there? I confess, I never quite understood
> what the C compiler does when you do something like a = (float *) &b; I
> always figured it was passing to a by reference somehow.
>
> Speaking of deep copy, the only tricky part of this is going to be pushing
> the data to the GPU and pulling it back out again. If I can return some kind
> of useful pointers to the GPU data, the way it is done in Torch7, it should
> work pretty well.
>
> I'm quite pleased with the blas-mp speedup; it's already useful.
>
> -SL
>
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm