Hello CUDA,

I'm trying to speed up my Python program, which uses a not-so-trivial algorithm, so I need to know:

What is the correct way of transferring a list of lists of floats to a (Py)CUDA kernel?

*An example*

Given, as an example, the following list

|listToProc = [[-1, -2, -3, -4, -5], [1, 2, 3, 4, 5, 6, 7, 8.1, 9]]|

it shall be transferred to a PyCUDA kernel for further processing. I would proceed with the usual approach for transferring an array of values (not a list of lists), like this:

|listToProcAr = np.array(listToProc, dtype=np.object)
listToProcAr_gpu = cuda.mem_alloc(listToProcAr.nbytes)
cuda.memcpy_htod(listToProcAr_gpu, listToProcAr)|

*However, this results in two problems:*

1) |listToProcAr.nbytes = 2|, i.e. too little memory is reserved. I believe this can be solved by

|listBytes = 0
for currentList in listToProc:
    listBytes += np.array(currentList, dtype=np.float32).nbytes|

and replacing the size argument here:

|listToProcAr_gpu = cuda.mem_alloc(listBytes)|
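To show what I mean about the size mismatch, here is a small self-contained snippet (host side only, no GPU needed; note I use plain |object| here, since the |np.object| alias is deprecated in newer NumPy versions):

```python
import numpy as np

listToProc = [[-1, -2, -3, -4, -5], [1, 2, 3, 4, 5, 6, 7, 8.1, 9]]

# The object-dtype array holds two *references* to the Python sub-lists,
# not the float values themselves, so .nbytes only counts two
# pointer-sized slots, regardless of how long the sub-lists are.
listToProcAr = np.array(listToProc, dtype=object)
print(listToProcAr.nbytes)   # 2 * listToProcAr.itemsize

# Summing the per-row float32 sizes gives the byte count actually needed:
listBytes = sum(np.array(row, dtype=np.float32).nbytes for row in listToProc)
print(listBytes)             # (5 + 9) * 4 = 56
```

So the allocation size itself I can fix as shown above.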

2) and the *actual problem*

|cuda.memcpy_htod(listToProcAr_gpu, listToProcAr)| still seems to create a wrong pointer in the kernel, because trying to access the last element of the second list (listToProc[1][8]) raises a

   PyCUDA WARNING: a clean-up operation failed (dead context maybe?)

So I'm a little clueless at the moment.

------------------------------------------------------------------------

*The PyCUDA code*

|__global__ void procTheListKernel(float **listOfLists)
{
    listOfLists[0][0] = 0;
    listOfLists[1][8] = 0;
    __syncthreads();
}|

Can anyone help me out?

Kind regards,
Frank
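For completeness: if I instead passed a single flat float32 buffer plus an int32 offsets array (one entry per row start), I imagine the kernel would look something like this (just an untested sketch of what I mean):

```cuda
__global__ void procTheListKernel(float *flat, const int *offsets)
{
    // Row i spans flat[offsets[i]] .. flat[offsets[i + 1] - 1].
    flat[offsets[0] + 0] = 0;   // was listOfLists[0][0]
    flat[offsets[1] + 8] = 0;   // was listOfLists[1][8]
    __syncthreads();
}
```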

_______________________________________________
PyCUDA mailing list
[email protected]
https://lists.tiker.net/listinfo/pycuda