Hi Simon,
I forgot to mention that I use the gradient of Cholesky, and this has even more error than the Cholesky decomposition itself, but I assume that this is because of a bug in Cholesky itself.
Paul
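One way to see how much of the gradient error could come from precision alone is to compare a Cholesky-based gradient in float32 against a float64 reference. The sketch below is only an illustration in plain NumPy, not Theano's Cholesky gradient: it uses the gradient of log det(A), which equals inv(A), and the helper names and the synthetic matrix are made up for this example.

```python
import numpy as np

def logdet_grad_via_cholesky(A, dtype):
    """Gradient of log det(A) w.r.t. A, i.e. inv(A), via a Cholesky factor in `dtype`."""
    A = np.asarray(A, dtype=dtype)
    L = np.linalg.cholesky(A)                 # A = L @ L.T, L lower triangular
    eye = np.eye(A.shape[0], dtype=dtype)
    Y = np.linalg.solve(L, eye)               # Y = inv(L); a general solver, not a triangular one
    return Y.T @ Y                            # inv(A) = inv(L).T @ inv(L)

def relative_error(x, ref):
    """Frobenius-norm relative error of x against the reference ref."""
    x = np.asarray(x, dtype=np.float64)
    ref = np.asarray(ref, dtype=np.float64)
    return float(np.linalg.norm(x - ref) / np.linalg.norm(ref))

# Synthetic symmetric positive-definite matrix (hypothetical stand-in for real data).
rng = np.random.default_rng(0)
B = rng.standard_normal((40, 40))
A = B @ B.T + 0.1 * np.eye(40)

L32 = np.linalg.cholesky(A.astype(np.float32))
L64 = np.linalg.cholesky(A)
g32 = logdet_grad_via_cholesky(A, np.float32)
g64 = logdet_grad_via_cholesky(A, np.float64)

factor_err = relative_error(L32, L64)   # error of the float32 factor
grad_err = relative_error(g32, g64)     # error of the float32 gradient
print("factor error:", factor_err, "gradient error:", grad_err)
```

If the gradient error is much larger than the factor error on the CPU as well, the extra error is probably a property of the computation in single precision and not specific to the GPU code path.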
On Wednesday, February 5, 2020 at 5:30:10 PM UTC+1, Paul Baggenstoss wrote:
>
> Hi Simon,
> I have uploaded the MATLAB-format file with the matrix Cll, which is the
> original matrix, R_cpu, which was produced on the CPU by
> slinalg.Cholesky(), and R_cuda, which was produced by the same function
> on the GPU (I think it uses theano.gpuarray.linalg.GpuCholesky()). Both
> used the same precision (float32), so they should give the same results.
> But you can see that at the end of the diagonal, the values go wild. It
> appears to be numerical errors.
> Thanks in advance!
> Paul
>
> On Wednesday, February 5, 2020 at 4:56:14 PM UTC+1, Wong Hang wrote:
>>
>> Hi,
>>
>> The GPU Cholesky decomposition relies on cuSOLVER or MAGMA. I believe
>> NVIDIA knows their hardware well, and cuSOLVER should provide the most
>> efficient result.
>>
>> Although Cholesky decomposition is numerically very stable, when I wrote
>> the test case I found that I ran into trouble even with relatively small
>> matrices if I used single precision.
>>
>> Are you using single precision on a big matrix?
>> If not, try computing the condition number of the matrix to see whether
>> it is too big.
>>
>> If it is not too big, then it may be a bug. I also need to use the
>> Cholesky operator. Please send me the matrix and I am willing to fix it.
>>
>> Best,
>>
>> On Thursday, February 6, 2020 at 0:34, Paul Baggenstoss <[email protected]> wrote:
>>
>>> Hi Simon, I was wondering if you got anywhere with the faster Cholesky
>>> for Theano. I also use it a lot and have found it to be unstable on
>>> the GPU.
>>> Paul
>>>
>>> On Saturday, March 7, 2015 at 11:45:36 AM UTC+1, Simon Ebner wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I want to do computations where I rely heavily on the Cholesky
>>>> decomposition. Writing a small benchmark for tensor.slinalg.Cholesky,
>>>> I noticed that the implementation is not as fast as I had hoped. As
>>>> far as I can tell, it is not optimized for GPUs yet but relies on the
>>>> SciPy implementation?
>>>> After a bit of Google searching I found several CUDA implementations
>>>> of fast Cholesky decomposition on the GPU. Before I try to include
>>>> that code in my Theano environment, could you let me know whether you
>>>> decided on purpose not to implement a fast Cholesky decomposition on
>>>> the GPU?
>>>> Furthermore, since I'm fairly new to Theano, I'm not completely sure
>>>> how best to incorporate CUDA code into my existing Theano code. Is it
>>>> sensible to create a custom Op with optimized C code?
>>>>
>>>> Best,
>>>> Simon
>>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "theano-users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/theano-users/aca41c35-ec36-4055-bac7-e53aced30ea7%40googlegroups.com
>>>
>>

--
---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/theano-users/17db7972-40e8-43f4-8387-83615142a23f%40googlegroups.com.
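
The condition-number check suggested in the thread can be sketched roughly as follows. This is a minimal NumPy-only sketch, not the Theano or cuSOLVER code path: the function name `cholesky_precision_report` and the synthetic matrix are assumptions made for illustration, since the uploaded matrix Cll is not reproduced here.

```python
import numpy as np

def cholesky_precision_report(C):
    """Compare float32 and float64 Cholesky factors of a symmetric positive-definite C."""
    C64 = np.asarray(C, dtype=np.float64)
    cond = float(np.linalg.cond(C64))                  # 2-norm condition number
    L64 = np.linalg.cholesky(C64)                      # float64 reference (lower triangular)
    # Raises numpy.linalg.LinAlgError if C is not positive definite in float32.
    L32 = np.linalg.cholesky(C64.astype(np.float32))
    d64 = np.diag(L64)
    d32 = np.diag(L32).astype(np.float64)
    rel = np.abs(d32 - d64) / d64                      # relative error along the diagonal
    return {
        "cond": cond,
        "eps32_times_cond": float(np.finfo(np.float32).eps) * cond,
        "max_rel_diag_error": float(rel.max()),
        "worst_diag_index": int(rel.argmax()),
    }

# Synthetic, well-conditioned stand-in for Cll (hypothetical data).
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 50))
C = B @ B.T + 50.0 * np.eye(50)

report = cholesky_precision_report(C)
print(report)
```

With the real data, Cll could be loaded from the MATLAB file with scipy.io.loadmat and passed in place of C. If eps32_times_cond comes out near or above 1, float32 probably has too few digits for this matrix on any device; if it is small but the GPU and CPU factors still disagree strongly, a bug becomes the more likely explanation.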
