[PyCUDA] question to gpuarray.mul_add

Andreas Baumbach Tue, 16 Jul 2013 12:19:16 -0700

Hi,

after I finally managed to subscribe to the mailing list, I just ran into
another issue. I'm still trying to implement a conjugate gradient method.
That already works, but the speedup vs. SciPy with CUBLAS optimisation is
only a factor of 4.


Basically, I need to store a scalar value (one double) on the GPU (as
opposed to in main RAM, where it lives now) and pass this value as an
argument to the mul_add function.
I tried using one-entry GPUArrays, but the only way I got this to work is
via gpuarray.get(), which transfers the value to the CPU only to put it
back on the GPU, so it gives no speedup at all. Is there any way to get
this working?
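
To make the question concrete, here is a rough sketch of one possible
direction. It does not use mul_add itself: it assumes that a custom
pycuda.elementwise.ElementwiseKernel, which receives the scalar as a
pointer to a one-entry GPUArray, would be an acceptable substitute. The
names axpy_device_scalar, AXPY_ARGS and AXPY_OP are made up for
illustration, and all arrays are assumed to be float64.

```python
import functools

# Hypothetical kernel signature: every argument is a device pointer,
# including the scalar `a`, which is a one-entry array on the GPU.
AXPY_ARGS = "double *z, double *x, double *a, double *y"
# The scalar is read from device memory as a[0], so it never visits the host.
AXPY_OP = "z[i] = x[i] + a[0] * y[i]"


@functools.lru_cache(maxsize=None)
def _get_kernel():
    # Imported lazily so this module can be imported without a CUDA device.
    from pycuda.elementwise import ElementwiseKernel
    return ElementwiseKernel(AXPY_ARGS, AXPY_OP, name="axpy_device_scalar")


def axpy_device_scalar(x_gpu, a_gpu, y_gpu):
    """Return x + a * y, where a is a one-entry float64 GPUArray."""
    import pycuda.gpuarray as gpuarray
    z_gpu = gpuarray.empty_like(x_gpu)
    # The launch size is taken from the first array argument (z_gpu).
    _get_kernel()(z_gpu, x_gpu, a_gpu, y_gpu)
    return z_gpu


def demo():
    # Requires a CUDA device; call this by hand to try the sketch.
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
    import pycuda.gpuarray as gpuarray

    x = gpuarray.to_gpu(np.ones(8))
    y = gpuarray.to_gpu(np.full(8, 2.0))
    a = gpuarray.dot(y, y)  # result stays on the GPU as a GPUArray
    return axpy_device_scalar(x, a, y).get()
```

In a conjugate gradient loop, the step size would then stay in a_gpu
between the dot product and the vector update, with no get() in between.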

Cheers,
Andi

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
