Hi Paul,

I haven't tested your matrix on my machine yet, and I won't have time until the weekend. I don't think cuSOLVER on the GPU would produce exactly the same result as the CPU implementation. cuSOLVER parallelizes the computation in a sophisticated way to improve performance, so the operations are not performed in the same order, but the difference between the two results should stay within a threshold.
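As a rough sketch of how one might check whether an error is "within a threshold", the NumPy snippet below computes the condition number of a matrix and compares a float32 Cholesky factor against a float64 reference. The matrix is synthetic (an assumption standing in for the real Cll, which is not reproduced here), the helper name reconstruction_error is made up for illustration, and everything runs on the CPU; it does not call cuSOLVER, MAGMA, or Theano.

```python
import numpy as np


def reconstruction_error(a, lower):
    """Relative Frobenius error ||L L^T - A|| / ||A||, evaluated in float64."""
    a64 = np.asarray(a, dtype=np.float64)
    l64 = np.asarray(lower, dtype=np.float64)
    return np.linalg.norm(l64 @ l64.T - a64) / np.linalg.norm(a64)


# Synthetic symmetric positive definite matrix (assumption: a stand-in for
# the real matrix from the thread, which is not available here).
rng = np.random.default_rng(0)
n = 100
b = rng.standard_normal((n, n))
a = b @ b.T / n + np.eye(n)

cond = np.linalg.cond(a)                        # 2-norm condition number
l32 = np.linalg.cholesky(a.astype(np.float32))  # single-precision factor
l64 = np.linalg.cholesky(a)                     # double-precision reference

err32 = reconstruction_error(a, l32)
err64 = reconstruction_error(a, l64)
# Relative distance between the two factors, measured in float64.
factor_gap = np.linalg.norm(l32 - l64) / np.linalg.norm(l64)

print("condition number:        ", cond)
print("float32 reconstruction:  ", err32)
print("float64 reconstruction:  ", err64)
print("float32 vs float64 factor:", factor_gap)
```

As a rule of thumb, the relative error in the factor can grow roughly in proportion to the condition number times the machine epsilon of the working precision, so a large condition number combined with float32 can explain a lot of disagreement without any bug. The same helper could be applied to a CPU factor and a GPU factor of the same matrix once both are available as arrays.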
If MAGMA's Cholesky decomposition is more stable, it is possible to implement a gradient operator for it the way GpuCholesky does. Just add support for float64 and implement the L_op method.

Best regards,
wonghang

Paul Baggenstoss <[email protected]> wrote on Thursday, February 6, 2020 at 6:28 PM:

> Simon,
> I did more digging and have some more information. I tested
> theano.gpuarray.linalg.GpuMagmaCholesky() on float32 and it looks good.
> The result is exactly the same as for the CPU.
> So the problem seems to be in cuSOLVER. The problem is that
> theano.gpuarray.linalg.GpuMagmaCholesky()(Cll) does not define a gradient
> and works only for float32. I installed the latest magma-2.5.2, and it has
> support for double-precision Cholesky (dpotrf), but Theano seems to use its
> own copy of the MAGMA source. I'm not sure how that works. Can I force
> Theano to use magma-2.5.2? If not, it seems feasible to borrow the gradient
> from theano.gpuarray.linalg.GpuCholesky() and add support for float64 as
> well. Thoughts?
> Paul
>
> On Wednesday, February 5, 2020 at 5:32:43 PM UTC+1, Paul Baggenstoss wrote:
>>
>> Hi Simon, I forgot to mention that I use the gradient of Cholesky, and
>> this has even more error than the Cholesky decomposition, but I assume
>> that this is because of a bug in Cholesky itself.
>> Paul
>>
>> On Wednesday, February 5, 2020 at 5:30:10 PM UTC+1, Paul Baggenstoss wrote:
>>>
>>> Hi Simon, I have uploaded the MATLAB-format file with the matrix Cll,
>>> which is the original matrix, R_cpu, which was produced on the CPU by
>>> slinalg.Cholesky(), and R_cuda, which was produced by the same function
>>> on the GPU (I think it uses theano.gpuarray.linalg.GpuCholesky()). Both
>>> used the same precision (float32), so they should give the same results.
>>> But you can see that at the end of the diagonal, the values go wild. It
>>> appears to be numerical error.
>>> Thanks in advance!
>>> Paul
>>>
>>> On Wednesday, February 5, 2020 at 4:56:14 PM UTC+1, Wong Hang wrote:
>>>>
>>>> Hi,
>>>>
>>>> The GPU Cholesky decomposition relies on cuSOLVER or MAGMA. I believe
>>>> NVIDIA knows its hardware well, so cuSOLVER should be the most
>>>> efficient option.
>>>>
>>>> Although Cholesky decomposition is numerically very stable, when I
>>>> wrote the test case I found that I ran into trouble even with
>>>> relatively small matrices when using single precision.
>>>>
>>>> Are you using single precision on a big matrix?
>>>> If not, try computing the condition number of the matrix to see
>>>> whether it is too big.
>>>>
>>>> If it is not too big, then it may be a bug. I also need to use the
>>>> Cholesky operator. Please send me the matrix and I am willing to fix it.
>>>>
>>>> Best,
>>>>
>>>> On Thursday, February 6, 2020 at 0:34, Paul Baggenstoss <[email protected]> wrote:
>>>>
>>>>> Hi Simon, I was wondering if you got anywhere with the faster Cholesky
>>>>> for Theano. I also use it a lot and have found it to be unstable on
>>>>> the GPU.
>>>>> Paul
>>>>>
>>>>> On Saturday, March 7, 2015 at 11:45:36 AM UTC+1, Simon Ebner wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I want to do computations where I rely heavily on the Cholesky
>>>>>> decomposition. Writing a small benchmark for tensor.slinalg.Cholesky,
>>>>>> I noticed that the implementation is not as fast as I had hoped. As
>>>>>> far as I can tell, it is not optimized for GPUs yet but relies on the
>>>>>> SciPy implementation?
>>>>>> Doing a bit of a Google search, I found several CUDA implementations
>>>>>> of fast Cholesky decompositions on the GPU. Before I try to include
>>>>>> that code in my Theano environment, could you let me know whether you
>>>>>> decided on purpose not to implement a fast Cholesky decomposition on
>>>>>> the GPU? Furthermore, since I'm fairly new to Theano, I'm not entirely
>>>>>> sure how best to incorporate CUDA code into my existing Theano code.
>>>>>> Is it sensible to create a custom Op with optimized C code?
>>>>>>
>>>>>> Best,
>>>>>> Simon

--

---
You received this message because you are subscribed to the Google Groups "theano-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/theano-users/CAAMb3nVt2v0Wa%3D7RRLi78EJFjO%3DXSGwHDDGhCqOOYK%2BWGC%2BZNg%40mail.gmail.com.
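On the suggestion in this thread to add a gradient by implementing the L_op method: the math such a method would need can be sketched in plain NumPy. The snippet below is only an illustration of the standard reverse-mode formula for L = chol(A) with a symmetric A, checked against a finite difference. The function names phi and cholesky_grad are hypothetical, and this is not Theano's Op API, not the GpuCholesky code, and not a GPU implementation.

```python
import numpy as np


def phi(m):
    """Lower triangle of m with the diagonal halved."""
    return np.tril(m) - 0.5 * np.diag(np.diag(m))


def cholesky_grad(a, l_bar):
    """Gradient of a scalar loss w.r.t. symmetric A, given l_bar = dloss/dL.

    Uses S = L^{-T} phi(L^T l_bar) L^{-1} and returns (S + S^T) / 2, which is
    the gradient with respect to symmetric perturbations of A.
    """
    lower = np.linalg.cholesky(a)
    p = phi(lower.T @ l_bar)
    x = np.linalg.solve(lower.T, p)        # L^{-T} P
    s = np.linalg.solve(lower.T, x.T).T    # (L^{-T} P) L^{-1}
    return 0.5 * (s + s.T)


# Finite-difference check on a small, well-conditioned float64 matrix.
rng = np.random.default_rng(1)
n = 5
b = rng.standard_normal((n, n))
a = b @ b.T + n * np.eye(n)
w = np.tril(rng.standard_normal((n, n)))   # loss = sum(w * L), so dloss/dL = w
e = rng.standard_normal((n, n))
e = 0.5 * (e + e.T)                        # symmetric perturbation direction


def loss(m):
    return np.sum(w * np.linalg.cholesky(m))


eps = 1e-6
fd = (loss(a + eps * e) - loss(a - eps * e)) / (2 * eps)  # central difference
analytic = np.sum(cholesky_grad(a, w) * e)                # directional derivative

print("finite difference:", fd)
print("analytic:         ", analytic)
```

The two solves against L^T involve a triangular matrix, so a real implementation would normally use triangular solves instead of a general solver. The gradient also amplifies any error already present in the factor, which is consistent with the observation in the thread that the gradient showed more error than the decomposition itself.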
