Correct. My result matrix will be too large. <sigh>
I would think cublasXT would take care of this for me. I thought it would
do some sort of divide and conquer. Is there a way to attack this sort of
problem?

On Mon, Nov 23, 2015 at 11:38 AM, Jonas Bardino <[email protected]> wrote:
> Ehmm, I'm not sure I understand exactly what you do, but to me it sounds
> like you are trying to calculate the dot product of a 160080 x 3 matrix
> and a similar one transposed, i.e. a 3 x 160080 matrix. That would give
> you a 160080 x 160080 matrix result - which surely won't fit in your 3GB
> of GPU memory.
>
> Cheers, Jonas
>
> On 2015-11-23 17:10, Keith Brown wrote:
>> I have two small matrices (160080, 3) of type float32 and I am
>> calculating their dot product. While doing this, I keep getting
>> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
>>
>> I have 2 cards, each with 3GB of memory. Each matrix takes about 1875
>> kilobytes. I am not sure why this is occurring.
>>
>> x = np.ones((160080, 3L)).astype(np.float32)
>> a_gpu = gpuarray.to_gpu(x)
>> b_gpu = gpuarray.to_gpu(x)
>> c_gpu = linalg.dot(a_gpu, b_gpu, 'N', 'T', handle=handle)
>>
>> My handle is a cublasXt handle (not regular cublas, since cublasXt
>> apparently does better memory handling).
>>
>> Any idea what is going on?

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
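[To make the size concrete: the result of A @ A.T here is 160080 x 160080 float32, i.e. roughly 160080^2 * 4 bytes ~ 95 GiB, far beyond 3GB of GPU memory. cublasXt tiles the *computation* across GPUs, but the full output matrix must still live somewhere. One common way to attack this is to compute the result in row/column tiles yourself, consuming or spilling each tile as it is produced. The sketch below is a minimal CPU-side illustration with NumPy; `dot_nt_tiled` and the `tile` parameter are hypothetical names, and in practice each `np.dot`-equivalent tile product could be dispatched to the GPU (e.g. via skcuda.linalg.dot) and `out` backed by an np.memmap on disk.]

```python
import numpy as np

def dot_nt_tiled(a, b, tile=4096, out=None):
    """Compute a @ b.T one (tile x tile) block at a time,
    so only a small block is materialized per step.

    a: (m, k) array, b: (n, k) array -> result (m, n).
    """
    m, n = a.shape[0], b.shape[0]
    if out is None:
        # For huge m, n this should be np.memmap (disk-backed),
        # or each block should be reduced/consumed immediately
        # instead of stored at all.
        out = np.empty((m, n), dtype=a.dtype)
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            # Each block product is small: (tile, k) @ (k, tile).
            out[i:i + tile, j:j + tile] = a[i:i + tile] @ b[j:j + tile].T
    return out
```

[Often the full Gram matrix is not actually needed downstream; if the end goal is something like A.T @ A, that is only (3, 3) here and trivially fits in memory, which is worth checking before tiling the big product.]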
