Thanks for the quick response.  I'll check and see if an LU decomposition
will do the trick

As far as my own cholesky code, it is quite a bit slower, even after your
edits.  For a 1000x1000 Float64 matrix (i.e., X = randn(1000,1000); X = X'
* X), here are the timing results I get:

3 Calls of @time choldn(X), the slow one I wrote

elapsed time: 1.313375682 seconds (9102620 bytes allocated)

elapsed time: 1.231566056 seconds (8000096 bytes allocated)

elapsed time: 1.240786689 seconds (8000096 bytes allocated)


3 calls of @time chol(X), the built in cholesky factorizer.

elapsed time: 0.032613575 seconds (8000736 bytes allocated)

elapsed time: 0.041127357 seconds (8000688 bytes allocated)

elapsed time: 0.066425175 seconds (8000688 bytes allocated)


If I had to guess, the significant timing differences here are due to smart
multi-threading in the LAPACK cholesky code.

For reference, these are running times on my mid-2009 macbook pro with 8
gigs of ram and 2.53 GHz Core 2 Duo.

On Mon, Jun 2, 2014 at 4:36 PM, Miles Lubin <miles.lu...@gmail.com> wrote:

> Hi Thomas,
>
> This is definitely a reasonable thing to want to do, and there are a
> number of methods to differentiate through a matrix factorization. Julia
> doesn't yet have a generic chol() method (see
> https://github.com/JuliaLang/julia/issues/1629), but it does have a
> generic LU decomposition method that works with DualNumbers. So if you
> replace cholfact with lufact your code might just work. However, this won't
> use LAPACK, and off-hand I'm not sure off hand if it's possible to
> reformulate a Cholesky with dual numbers into something LAPACK knows how to
> compute. There also might be some performance issues with your hand-written
> Cholesky factorization, try:
>
> function choldn!{T}(A::Matrix{T})
>
> instead of
>
> function choldn!(A::Array)
>  ...
>  T = eltype(A);
>
>
> Exactly how much slower is it?
>
>
> On Monday, June 2, 2014 2:43:32 PM UTC-4, Thomas Covert wrote:
>>
>> I'm really excited to use DualNumbers and the autodiff option in
>> Optim.jl.  However, the functions I would like to optimize often make use
>> of linear algebra, including cholesky factorizations.  At present,
>> DualNumbers.jl does not implement a cholesky factorization.  My first pass
>> at a workaround for this was to just write out a standard cholesky solver
>>  (i.e., just writing out the equations listed in wikipedia).  This seems to
>> work fine - i.e., my "choldn()" function factors a DualNumber array into
>> upper and lower triangular matrices whose product is the input.
>>
>> However, my "choldn" function listed below is MUCH slower than the LAPACK
>> call built into chol().  I can't say I'm surprised by this, but the speed
>> hit makes the whole enterprise of using autodiff with DualNumbers
>> considerably less useful.
>>
>> I was hoping to find some neat linear algebra trick that would let me
>> compute a DualNumber cholesky factorization without having to resort to
>> non-LAPACK code, but I haven't found it yet.  That is, I figured that I
>> could compute the cholesky factorization of the real part of the matrix and
>> then separately compute a matrix which represented the DualNumber part.
>>  However, I've not yet found a math text describing such a beast.
>>
>> Any ideas?  Am I wrong in thinking that a Cholesky decomposition of a
>> square array with DualNumbers is a well-defined mathematical object?
>>
>> I've included my code for "choldn()" below.
>>
>> Thanks in advance.
>>
>>
>> CODE
>>
>> function choldn!(A::Array)
>>
>>     N = size(A,1);
>>
>>     T = eltype(A);
>>
>>     tmp = zero(T);
>>
>>     for j = 1:N # cols
>>
>>
>>         for i = j:N # rows
>>
>>
>>             if i == j
>>
>>                 if i == 1
>>
>>                     A[i,j] = sqrt(A[i,j]);
>>
>>                 else
>>
>>                     tmp = zero(T);
>>
>>                     for k = 1:j-1
>>
>>                         tmp += A[j,k] ^ 2;
>>
>>                     end
>>
>>                     A[i,j] = sqrt(A[i,j] - tmp);
>>
>>                 end
>>
>>             else
>>
>>                 if j == 1
>>
>>                     A[i,j] = A[i,j] / A[j,j];
>>
>>                 else
>>
>>                     tmp = zero(T);
>>
>>                     for k = 1:j-1
>>
>>                         tmp += A[i,k] * A[j,k];
>>
>>                     end
>>
>>                     A[i,j] = (A[i,j] - tmp) / A[j,j];
>>
>>                 end
>>
>>             end
>>
>>         end
>>
>>     end
>>
>>     # then zero out the non-cholesky stuff
>>
>>
>>     for i = 1:N
>>
>>         if i < N
>>
>>             for j = i+1:N
>>
>>                 A[i,j] = zero(T);
>>
>>             end
>>
>>         end
>>
>>     end
>>
>> end
>>
>>

Reply via email to