I'm really excited to use DualNumbers and the autodiff option in Optim.jl. 
 However, the functions I would like to optimize often make use of linear 
algebra, including cholesky factorizations.  At present, DualNumbers.jl 
does not implement a cholesky factorization.  My first pass at a workaround 
for this was to just write out a standard cholesky solver  (i.e., just 
writing out the equations listed in wikipedia).  This seems to work fine - 
i.e., my "choldn()" function factors a DualNumber array into upper and 
lower triangular matrices whose product is the input.  

However, my "choldn" function listed below is MUCH slower than the LAPACK 
call built into chol().  I can't say I'm surprised by this, but the speed 
hit makes the whole enterprise of using autodiff with DualNumbers 
considerably less useful.

I was hoping to find some neat linear algebra trick that would let me 
compute a DualNumber cholesky factorization without having to resort to 
non-LAPACK code, but I haven't found it yet.  That is, I figured that I 
could compute the cholesky factorization of the real part of the matrix and 
then separately compute a matrix which represented the DualNumber part. 
 However, I've not yet found a math text describing such a beast.

Any ideas?  Am I wrong in thinking that a Cholesky decomposition of a 
square array with DualNumbers is a well-defined mathematical object?

I've included my code for "choldn()" below.

Thanks in advance.


CODE 

function choldn!(A::Array)

    N = size(A,1);

    T = eltype(A);

    tmp = zero(T);

    for j = 1:N # cols                                                     
     

        for i = j:N # rows                                                 
     

            if i == j

                if i == 1

                    A[i,j] = sqrt(A[i,j]);

                else

                    tmp = zero(T);

                    for k = 1:j-1

                        tmp += A[j,k] ^ 2;

                    end

                    A[i,j] = sqrt(A[i,j] - tmp);

                end

            else

                if j == 1

                    A[i,j] = A[i,j] / A[j,j];

                else

                    tmp = zero(T);

                    for k = 1:j-1

                        tmp += A[i,k] * A[j,k];

                    end

                    A[i,j] = (A[i,j] - tmp) / A[j,j];

                end

            end

        end

    end

    # then zero out the non-cholesky stuff                                 
     

    for i = 1:N

        if i < N

            for j = i+1:N

                A[i,j] = zero(T);

            end

        end

    end

end

Reply via email to