I am trying to write a GPU kernel to expand an array into a much larger sparse array using PyCUDA (on which I will then perform some linear algebra). Currently I have a simple CPU-based implementation working nicely, and I was hoping that PyCUDA would have the tools to let me perform such an operation in parallel without dropping down to a C kernel that manipulates raw memory addresses.
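For concreteness, here is a CPU-side NumPy sketch of the kind of expansion I mean. The mapping arrays `rows` and `cols` (and the array shapes) are just illustrative, not my real mapping:

```python
import numpy as np

# Small dense input: each row of A gets scattered into a much larger M.
A = np.arange(6, dtype=np.float64).reshape(2, 3)

# Illustrative index maps: element A[i, j] lands at M[rows[i, j], cols[i, j]].
rows = np.array([[0, 2, 4], [1, 3, 5]])
cols = np.array([[0, 3, 6], [1, 4, 7]])

# The large, mostly-zero target array.
M = np.zeros((6, 8))

# The scatter step -- this is the per-element write I would like to do on the GPU.
M[rows, cols] = A
```

Each write touches a distinct element of M, so the elements are independent of one another; that is what makes me think it should parallelize well.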
I need to read a row from A_gpu and then make some modifications to specific elements of M_gpu. Such an uncoupled task screams GPGPU. I am stuck because I cannot seem to figure out how to retrieve a single element from a gpuarray without writing a C kernel that works with memory addresses. Is there any way to do M_gpu[i,j] in PyCUDA that I missed in the docs?

Thanks,
~Garrett

ps. I love PyCUDA!!!
_______________________________________________
PyCUDA mailing list
pyc...@host304.hostmonster.com
http://host304.hostmonster.com/mailman/listinfo/pycuda_tiker.net