You are computing the product of a [160080, 3] and a [3, 160080] matrix,
so the result is a [160080, 160080] matrix. To store a matrix of that
size (as float32) you would need about 95 GB of RAM. That's a tough fit for a
3GB GPU ;-)
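A quick back-of-the-envelope check of that figure (plain Python, no GPU needed):

```python
# Size of the [160080, 160080] float32 result of dot(a, b, 'N', 'T')
n = 160080
bytes_needed = n * n * 4          # float32 = 4 bytes per element
gib = bytes_needed / 2**30
print(f"{gib:.1f} GiB")           # roughly 95.5 GiB
```

Compare that with the inputs: each [160080, 3] operand is only 160080 * 3 * 4 bytes, under 2 MB, which is why the inputs upload fine but the allocation for the result fails.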
On 2015-11-23 17:10, Keith Brown wrote:
I have two small matrices (160080, 3) of type float32 and I am
calculating their dot product. While doing this, I keep getting
pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
I have 2 cards, each with 3GB of memory. Each matrix takes about 1875
kilobytes. I am not sure why this is occurring.
import numpy as np
import pycuda.autoinit
from pycuda import gpuarray
from skcuda import linalg

x = np.ones((160080, 3)).astype(np.float32)
a_gpu = gpuarray.to_gpu(x)
b_gpu = gpuarray.to_gpu(x)
c_gpu = linalg.dot(a_gpu, b_gpu, 'N', 'T', handle=handle)
My handle is a cublasxt handle (not regular cublas, since blasxt
apparently does better memory handling).
Any idea what is going on?
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda