On Sat, Mar 24, 2012 at 20:02, Ezequiel Alfíe <[email protected]> wrote:
> Answering my own question:
>
> The code below seems to work, but it's odd enough to make me doubt it.
>
> I guess there's a better (i.e., the intended) way to do it... do you
> have any suggestions?
>
> ## file test.cu
>
> #include <stdio.h>
>
> extern "C" {
> __global__ void use_my_array();
> __device__ float *my_array;
> }
> /* simple kernel just to see values of my_array */
> __global__ void use_my_array() {
>    int i = threadIdx.x;
>    printf("%.2e\n", my_array[i]);
>    return;
> }
>
> ## compile with: nvcc -arch=sm_21 -cubin test.cu
>
> ## file test.py
>
> import struct
> import numpy as np
> import pycuda.autoinit
> from pycuda.gpuarray import GPUArray
> import pycuda.driver as drv
>
> mod=drv.module_from_file('test.cubin')
> use_my_array=mod.get_function('use_my_array')
>
> #array of 4 float32s on host with random values
>
> array_host=np.zeros((4,), np.float32)
> r=np.random.random(4)
> array_host[:] = r[:]
>
> #allocate from pycuda
> array_gpu=GPUArray((4,), np.float32)
>
> #copy values from array_host
> array_gpu.set(array_host)
>
> # get pointer of pycuda allocated gpu array
> gpu_pointer = array_gpu.ptr
>
> # get address of gpu_pointer and its size
> my_array_pointer, pointer_size = mod.get_global('my_array')
>
> # convert gpu_pointer into string of bytes and
> # copy it to the address of my_array pointer
> drv.memcpy_htod(my_array_pointer, struct.pack("Q", long(gpu_pointer)))

According to Python's struct module, I guess this should be "P"
(native void * pointer) instead of "Q" (unsigned long long); note that
"P" is only available in native byte-order/size mode, which is the
default.

(Of course, the two happen to be the same size on amd64.)
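For what it's worth, here's a minimal sketch (no GPU needed) showing that "P" and a pointer-sized numpy array produce the same bytes; the address value is made up for illustration. Since memcpy_htod accepts any buffer-like source, I believe a one-element numpy.uintp array could be passed to it directly instead of going through struct:

```python
import struct
import numpy as np

# A made-up device address, just for illustration.
gpu_pointer = 0x1000

# "P" packs a void * in native size and byte order
# (only valid in native mode, which is the default).
buf_struct = struct.pack("P", gpu_pointer)

# The same bytes via numpy: uintp is the pointer-sized unsigned integer.
buf_numpy = np.array([gpu_pointer], dtype=np.uintp).tobytes()

assert buf_struct == buf_numpy
assert len(buf_struct) == struct.calcsize("P")
```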

>
> use_my_array(block=(4,1,1))
> print array_host
>
>
> On Sat, Mar 24, 2012 at 19:08, Ezequiel Alfíe <[email protected]> wrote:
>> Hello everyone. I'd like to know:
>>
>> is it possible (how?) to reserve memory from pycuda and assign the
>> resulting device pointer to a "global" pointer on a cuda module?
>>
>> like, for instance, having inside a test.cu file
>>
>> __device__ float *my_array;
>>
>> __global__ void somekernel() {
>>   int i = threadIdx.x;
>>   my_array[i] = 0.3f;
>> }
>>
>> (have test.cu compiled into cubin test.cubin)
>>
>> and from python calling:
>> import pycuda.autoinit
>> import pycuda.driver as drv
>> mod=drv.module_from_file('test.cubin')
>>
>> #dynamically choose size
>> size = 4 * 1000 # 4 is sizeof float32
>> mem1 = drv.mem_alloc(size)
>> mem1_pointer = long(mem1)
>>
>> my_array_pointer, size = mod.get_global('my_array')
>>
>> drv.memcpy_htod(my_array_pointer, mem1_pointer)
>> # this fails with TypeError: expected a readable buffer object
>>
>> --
>> How do I write the mem1_pointer to my_array ?
>> Is there a simpler approach to reserving global memory from pycuda and
>> assigning it to a global pointer variable so it is usable from kernels?
>> (without passing it as kernel parameter)
>>
>> Thanks,
>> Ezequiel

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
