I'm really excited to use DualNumbers and the autodiff option in Optim.jl.
However, the functions I would like to optimize often make use of linear
algebra, including cholesky factorizations. At present, DualNumbers.jl
does not implement a cholesky factorization. My first pass at a workaround
for this was to just write out a standard cholesky solver (i.e., just
writing out the equations listed in wikipedia). This seems to work fine -
i.e., my "choldn()" function factors a DualNumber array into upper and
lower triangular matrices whose product is the input.
However, my "choldn" function listed below is MUCH slower than the LAPACK
call built into chol(). I can't say I'm surprised by this, but the speed
hit makes the whole enterprise of using autodiff with DualNumbers
considerably less useful.
I was hoping to find some neat linear algebra trick that would let me
compute a DualNumber cholesky factorization without having to resort to
non-LAPACK code, but I haven't found it yet. That is, I figured that I
could compute the cholesky factorization of the real part of the matrix and
then separately compute a matrix which represented the DualNumber part.
However, I've not yet found a math text describing such a beast.
Any ideas? Am I wrong in thinking that a Cholesky decomposition of a
square array with DualNumbers is a well-defined mathematical object?
I've included my code for "choldn()" below.
Thanks in advance.
CODE
function choldn!(A::Array)
N = size(A,1);
T = eltype(A);
tmp = zero(T);
for j = 1:N # cols
for i = j:N # rows
if i == j
if i == 1
A[i,j] = sqrt(A[i,j]);
else
tmp = zero(T);
for k = 1:j-1
tmp += A[j,k] ^ 2;
end
A[i,j] = sqrt(A[i,j] - tmp);
end
else
if j == 1
A[i,j] = A[i,j] / A[j,j];
else
tmp = zero(T);
for k = 1:j-1
tmp += A[i,k] * A[j,k];
end
A[i,j] = (A[i,j] - tmp) / A[j,j];
end
end
end
end
# then zero out the non-cholesky stuff
for i = 1:N
if i < N
for j = i+1:N
A[i,j] = zero(T);
end
end
end
end