Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Paul Baggenstoss
Hi wonghang, Sorry to pester you with emails, but I have some interesting timing information. I ran a process on different processors with different ways of computing Cholesky(). The results are surprising: GpuMagmaCholesky(): 9.0 sec; slinalg.Cholesky (uses cuSOLVER): 2.9 sec; CPU
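For anyone wanting to reproduce this kind of comparison, here is a minimal timing sketch. It is not the script Paul used; `best_time` and the stand-in workload are hypothetical names for illustration. It takes the best of several runs so that one-off warm-up cost does not dominate. For a GPU op you would also need to make sure the result has actually been computed (for example by copying it back to the host) before stopping the clock.

```python
import time

def best_time(fn, *args, repeats=3):
    """Return the smallest wall-clock time (seconds) over `repeats` calls of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

def stand_in_workload(n):
    """Hypothetical placeholder; swap in the compiled function you want to time."""
    return sum(i * i for i in range(n))

elapsed = best_time(stand_in_workload, 100000)
print("best of 3: %.6f sec" % elapsed)
```

Timing each variant with the same helper and the same input keeps the numbers comparable.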

Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Wong Hang
Hi all, I found that the Cholesky factorization unit test no longer works. The values returned are completely wrong. It looks like a memory error. I checked that if I skip the tril call, the values returned by cuSOLVER are correct. There must be something wrong in libgpuarray
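For context on why a tril step follows the factorization: LAPACK-style potrf routines overwrite only one triangle of the input and leave the other triangle holding the original entries, so the caller zeroes it afterwards. The pure-Python sketch below imitates that behaviour on a 2x2 matrix. `potrf_lower_inplace` and `tril_inplace` are illustrative stand-ins, not the cuSOLVER or libgpuarray functions.

```python
import math

def potrf_lower_inplace(a):
    """Overwrite the lower triangle of `a` with its Cholesky factor.

    The strict upper triangle is neither read nor written, as with LAPACK-style potrf.
    """
    n = len(a)
    for j in range(n):
        s = sum(a[j][k] ** 2 for k in range(j))
        a[j][j] = math.sqrt(a[j][j] - s)
        for i in range(j + 1, n):
            s = sum(a[i][k] * a[j][k] for k in range(j))
            a[i][j] = (a[i][j] - s) / a[j][j]
    return a

def tril_inplace(a):
    """Zero the strict upper triangle so only the factor L remains."""
    n = len(a)
    for i in range(n):
        for j in range(i + 1, n):
            a[i][j] = 0.0
    return a

a = [[4.0, 2.0], [2.0, 5.0]]
work = [row[:] for row in a]

potrf_lower_inplace(work)
print(work)   # [[2.0, 2.0], [1.0, 2.0]]  upper entry is still the original 2.0

tril_inplace(work)
print(work)   # [[2.0, 0.0], [1.0, 2.0]]  clean lower-triangular factor
```

If the factor is right before the zeroing step and wrong after it, the zeroing step (or how it is launched) is the natural suspect, which matches the observation above.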

Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Paul Baggenstoss
Hi Wong Hang, Yes, that's what I saw: the errors started near the end of the matrix. After that, the numbers appeared random. I'll try the older version and let you know what I find. Paul On Friday, February 7, 2020 at 3:18:23 PM UTC+1, Wong Hang wrote: > > I suddenly get the HEAD version of

Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Wong Hang
I suddenly got the HEAD version of libgpuarray to work. I found that if I increase the size of the matrix, the error appears. The first few rows of the matrix are correct, and then the remaining rows contain errors. I guess there is a synchronization or memory bug. $ python3 cho.py row
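A row-by-row comparison against a trusted reference is one way to locate where a factor starts to go wrong. The sketch below is not the `cho.py` from this thread; it is a pure-Python illustration with hypothetical helper names (`cholesky_lower`, `first_bad_row`), and the corruption of the trailing rows is simulated.

```python
import math

def cholesky_lower(a):
    """Reference lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def first_bad_row(candidate, reference, tol=1e-8):
    """Return the index of the first row that differs from the reference, or None."""
    for i, (row_c, row_r) in enumerate(zip(candidate, reference)):
        if any(abs(c - r) > tol for c, r in zip(row_c, row_r)):
            return i
    return None

# Build a small symmetric positive-definite matrix A = M M^T + n*I.
n = 6
m = [[((i * 7 + j * 3) % 5) / 5.0 for j in range(n)] for i in range(n)]
a = [[sum(m[i][k] * m[j][k] for k in range(n)) + (n if i == j else 0.0)
      for j in range(n)] for i in range(n)]

ref = cholesky_lower(a)

# Simulate the reported symptom: leading rows fine, trailing rows garbage.
corrupted = [row[:] for row in ref]
for i in range(4, n):
    corrupted[i] = [x + 1.0 for x in corrupted[i]]

print(first_bad_row(ref, ref))        # None
print(first_bad_row(corrupted, ref))  # 4
```

Running the same check at several matrix sizes shows whether the first bad row tracks a fixed offset (suggesting a block or buffer boundary) or moves around between runs (suggesting a race).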

Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Wong Hang
I am quite sure I once got correct results even when the matrix size was >1000. Let me do more research and testing later and get back to you. On Fri, Feb 7, 2020 at 11:18 PM, Wong Hang wrote: > I suddenly get the HEAD version of libgpuarray works > I found that if I increase the size of the matrix, the error

Re: [theano-users] Re: Cholesky decomposition slow

2020-02-07 Thread Wong Hang
Hi Paul, I think I fixed the issue. Please check the PR https://github.com/Theano/libgpuarray/pull/589; you can try my branch of libgpuarray to see if it works for you. For your implementation of MagmaCholesky, you can add profile = True to ~/.theanorc to see what the bottleneck of