>>> c_gpu = linalg.dot(a_gpu,b_gpu,'N','T',handle=handle)
Isn't your output matrix of size 160080x160080?
Yiyin
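
[Editor's note: the point above can be checked with quick arithmetic. The shapes and dtype below are taken from the thread; the byte counts are computed, not quoted. A minimal sketch:]

```python
import numpy as np

# Shapes from the thread: A and B are both (160080, 3), float32 (4 bytes/element).
rows, cols = 160080, 3
itemsize = np.dtype(np.float32).itemsize  # 4 bytes

# Each input matrix really is small.
input_bytes = rows * cols * itemsize
print(input_bytes / 1024)       # ~1876 KB, matching the "about 1875 kilobytes" estimate

# But dot(A, B, 'N', 'T') computes A @ B.T, whose result is (160080, 160080).
output_bytes = rows * rows * itemsize
print(output_bytes / 1024**3)   # ~95.4 GiB -- far beyond 3 GB per card
```

The inputs fit comfortably; it is the (160080, 160080) output allocation that triggers cuMemAlloc's out-of-memory error.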


On Mon, Nov 23, 2015 at 11:43 AM, Keith Brown <[email protected]> wrote:

> I modified add_dot() to use cublas.xt.cublasXtSgemm. I don't think I
> need to modify dot() because it calls add_dot() at the end. It isn't
> calling cublasxt.cublasXtSgemm directly unless my matrix is 1-D (which
> it isn't). Correct?
>
> BTW, smaller matrices work fine; it's just the larger matrices that fail.
>
>
> On Mon, Nov 23, 2015 at 11:35 AM, Lev Givon <[email protected]> wrote:
> > Received from Keith Brown on Mon, Nov 23, 2015 at 11:10:45AM EST:
> >> I have two small matrices of shape (160080, 3), type float32, and I am
> >> calculating their dot product. While doing this, I keep getting
> >> pycuda._driver.MemoryError: cuMemAlloc failed: out of memory.
> >>
> >> I have 2 cards, each with 3 GB of memory. Each matrix takes about 1875
> >> kilobytes. I am not sure why this is occurring.
> >>
> >> x = np.ones((160080, 3)).astype(np.float32)
> >> a_gpu = gpuarray.to_gpu(x)
> >> b_gpu = gpuarray.to_gpu(x)
> >> c_gpu = linalg.dot(a_gpu, b_gpu, 'N', 'T', handle=handle)
> >>
> >> My handle is a cublasxt handle (not regular cublas, since blasxt
> >> apparently does better memory handling).
> >>
> >> Any idea what is going on?
> >
> > Did you also modify skcuda.linalg.dot() to explicitly call the
> > cublasXt*gemm functions rather than the stock cublas*gemm functions?
> > The cublasXt*gemm functions expect host memory pointers as their
> > arguments, not GPU memory pointers.
> > --
> > Lev Givon
> > Bionet Group | Neurokernel Project
> > http://lebedov.github.io/
> > http://neurokernel.github.io/
> >
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
>
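[Editor's note: the transpose flags decide whether the result is huge or tiny. A NumPy sketch of the shape arithmetic, using a hypothetical (5, 3) stand-in for the (160080, 3) arrays in the thread:]

```python
import numpy as np

# Stand-in for the (160080, 3) float32 arrays, shrunk to (5, 3).
a = np.ones((5, 3), dtype=np.float32)
b = np.ones((5, 3), dtype=np.float32)

# linalg.dot(a, b, 'N', 'T') corresponds to a @ b.T: an (n, n) result
# that grows quadratically with the row count (160080**2 floats above).
print((a @ b.T).shape)   # (5, 5)

# If a 3x3 Gram-style product was intended, 'T', 'N' (a.T @ b) stays
# tiny no matter how many rows the inputs have.
print((a.T @ b).shape)   # (3, 3)
```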
