This is the linalg from the cublas wrapper, not the one from numpy.

On Mon, Nov 23, 2015 at 11:24 AM, Andreas Kloeckner <[email protected]> wrote:
> Keith Brown <[email protected]> writes:
>
>> I have two small matrices of shape (160080,3) and type float32, and I am
>> calculating their dot product. While doing this, I keep getting
>> pycuda.__driver.MemoryError: cuMemAlloc failed out of memory.
>>
>> I have 2 cards, each with 3GB of memory. Each matrix takes about 1875
>> kilobytes. I am not sure why this is occurring.
>>
>> x=np.ones((160080,3L)).astype(np.float32)
>> a_gpu=gpuarray.to_gpu(x)
>> b_gpu=gpuarray.to_gpu(x)
>> c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)
>>
>> My handle is a cublasxt handle (not regular cublas, since cublasxt
>> apparently does better memory handling).
>
> Is "linalg" the regular numpy linalg module? If so, that's not going to
> work, because that effectively accesses the GPU array element-by-element
> across the PCIe bus. You probably need to call a cublas wrapper.
>
> Andreas
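
One thing that may be worth checking is the size of the output rather than the inputs. This is only a guess at the cause and is not confirmed anywhere in this thread. The sketch below is plain Python and does not touch the GPU; the helper names gemm_result_shape and nbytes are made up for illustration and are not part of PyCUDA or any cublas wrapper. It assumes the 'N'/'T' flags mean "use as is" and "transpose", as in BLAS gemm.

```python
# Hypothetical helpers (not part of PyCUDA or any cublas wrapper):
# estimate the shape and memory footprint of op(A) @ op(B).

def gemm_result_shape(a_shape, b_shape, transa="N", transb="N"):
    """Return the shape of op(A) @ op(B); op transposes when the flag is 'T'."""
    m, k = a_shape if transa.upper() == "N" else a_shape[::-1]
    k2, n = b_shape if transb.upper() == "N" else b_shape[::-1]
    if k != k2:
        raise ValueError("inner dimensions do not match: %d vs %d" % (k, k2))
    return (m, n)


def nbytes(shape, itemsize=4):
    """Bytes needed for a dense 2-D array; itemsize=4 corresponds to float32."""
    rows, cols = shape
    return rows * cols * itemsize


shape = (160080, 3)

# Each input: 160080 * 3 * 4 bytes, i.e. roughly 1875 KiB as stated above.
print("input bytes:", nbytes(shape))

# 'N','T' computes A @ B^T, so the result is 160080 x 160080.
out_nt = gemm_result_shape(shape, shape, "N", "T")
print("N,T result shape:", out_nt)
print("N,T result GiB: %.2f" % (nbytes(out_nt) / 2.0**30))

# 'T','N' computes A^T @ B, so the result is only 3 x 3.
out_tn = gemm_result_shape(shape, shape, "T", "N")
print("T,N result shape:", out_tn, "bytes:", nbytes(out_tn))
```

If that reading of the flags is right, the 'N','T' result alone would need about 95 GiB as float32, far more than a 3GB card can hold, while 'T','N' would give a 3x3 result of 36 bytes. Which of the two products is actually intended is not stated in the original question.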
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
