>>> c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)

Isn't your output matrix of size 160080x160080?

Yiyin
On Mon, Nov 23, 2015 at 11:43 AM, Keith Brown <[email protected]> wrote:
> I modified add_dot() to use cublas.xt.cublasXtSgemm. I don't think I
> need to modify dot() because it calls add_dot() at the end. It's not
> calling cublasxt.cublasXtSgemm directly unless my matrix is 1-D (which
> it isn't). Correct?
>
> BTW, smaller matrices work fine; it's just the larger matrices that fail.
>
> On Mon, Nov 23, 2015 at 11:35 AM, Lev Givon <[email protected]> wrote:
> > Received from Keith Brown on Mon, Nov 23, 2015 at 11:10:45AM EST:
> >> I have two small matrices (160080, 3) of type float32 and I am
> >> calculating their dot product. While doing this, I keep getting
> >> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
> >>
> >> I have 2 cards, each with 3 GB of memory. Each matrix takes about 1875
> >> kilobytes. I am not sure why this is occurring.
> >>
> >> x=np.ones((160080,3L)).astype(np.float32)
> >> a_gpu=gpuarray.to_gpu(x)
> >> b_gpu=gpuarray.to_gpu(x)
> >> c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)
> >>
> >> My handle is a cublasxt handle (not regular cublas, since cublasXt
> >> apparently does better memory handling).
> >>
> >> Any idea what is going on?
> >
> > Did you also modify skcuda.linalg.dot() to explicitly call the cublasXt*gemm
> > functions rather than the stock cublas*gemm functions? The cublasXt*gemm
> > functions expect host memory pointers as their arguments, not GPU memory
> > pointers.
> > --
> > Lev Givon
> > Bionet Group | Neurokernel Project
> > http://lebedov.github.io/
> > http://neurokernel.github.io/
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
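[Editorial note: Yiyin's observation explains the out-of-memory error. With 'N','T' the call computes a @ b.T, so although each input is only ~1.9 MB, the result is 160080x160080. A quick sanity check of the arithmetic, in plain NumPy with no GPU required:]

```python
import numpy as np

# Shapes from the thread: a and b are both (160080, 3) float32.
m, k = 160080, 3

# linalg.dot(a_gpu, b_gpu, 'N', 'T') computes a @ b.T, so the result
# is (m, m) = (160080, 160080), not a small matrix.
out_bytes = m * m * np.dtype(np.float32).itemsize
print(out_bytes)           # 102502425600 bytes
print(out_bytes / 2**30)   # ~95.5 GiB, far beyond a 3 GB card
```

No allocator, cublasXt or otherwise, can fit that result on a 3 GB device, which is why cuMemAlloc fails.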
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
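[Editorial note: even cublasXt, which tiles the GEMM across GPUs and takes host-memory pointers as Lev notes, still needs the full result matrix allocated somewhere (~95.5 GiB here). When only parts of the product are actually needed, one workaround is to compute it block by block. A minimal NumPy sketch of the idea; the tile size and block indices are arbitrary illustrative choices, not from the thread:]

```python
import numpy as np

m, k = 160080, 3
a = np.ones((m, k), dtype=np.float32)  # same data as the thread's x

# Compute one (tile x tile) block of a @ a.T at a time; only the block
# currently being worked on is materialized, never the full (m, m) result.
tile = 1024
i, j = 0, 1  # block row/column indices (illustrative)
c_block = a[i*tile:(i+1)*tile] @ a[j*tile:(j+1)*tile].T
print(c_block.shape)  # (1024, 1024)
# Each row of a is three ones, so every entry of this block equals 3.0.
```

The same blocking pattern works with a device GEMM in place of the `@`, transferring one tile of the result back to the host at a time.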
