Hi Thomas,
This is definitely a reasonable thing to want to do, and there are a number
of methods to differentiate through a matrix factorization. Julia doesn't
yet have a generic chol() method
(see https://github.com/JuliaLang/julia/issues/1629), but it does have a
generic LU decomposition method that works with DualNumbers. So if you
replace cholfact with lufact your code might just work. However, this won't
use LAPACK, and off-hand I'm not sure off hand if it's possible to
reformulate a Cholesky with dual numbers into something LAPACK knows how to
compute. There also might be some performance issues with your hand-written
Cholesky factorization, try:
function choldn!{T}(A::Matrix{T})
instead of
function choldn!(A::Array)
...
T = eltype(A);
Exactly how much slower is it?
On Monday, June 2, 2014 2:43:32 PM UTC-4, Thomas Covert wrote:
>
> I'm really excited to use DualNumbers and the autodiff option in Optim.jl.
> However, the functions I would like to optimize often make use of linear
> algebra, including cholesky factorizations. At present, DualNumbers.jl
> does not implement a cholesky factorization. My first pass at a workaround
> for this was to just write out a standard cholesky solver (i.e., just
> writing out the equations listed in wikipedia). This seems to work fine -
> i.e., my "choldn()" function factors a DualNumber array into upper and
> lower triangular matrices whose product is the input.
>
> However, my "choldn" function listed below is MUCH slower than the LAPACK
> call built into chol(). I can't say I'm surprised by this, but the speed
> hit makes the whole enterprise of using autodiff with DualNumbers
> considerably less useful.
>
> I was hoping to find some neat linear algebra trick that would let me
> compute a DualNumber cholesky factorization without having to resort to
> non-LAPACK code, but I haven't found it yet. That is, I figured that I
> could compute the cholesky factorization of the real part of the matrix and
> then separately compute a matrix which represented the DualNumber part.
> However, I've not yet found a math text describing such a beast.
>
> Any ideas? Am I wrong in thinking that a Cholesky decomposition of a
> square array with DualNumbers is a well-defined mathematical object?
>
> I've included my code for "choldn()" below.
>
> Thanks in advance.
>
>
> CODE
>
> function choldn!(A::Array)
>
> N = size(A,1);
>
> T = eltype(A);
>
> tmp = zero(T);
>
> for j = 1:N # cols
>
>
> for i = j:N # rows
>
>
> if i == j
>
> if i == 1
>
> A[i,j] = sqrt(A[i,j]);
>
> else
>
> tmp = zero(T);
>
> for k = 1:j-1
>
> tmp += A[j,k] ^ 2;
>
> end
>
> A[i,j] = sqrt(A[i,j] - tmp);
>
> end
>
> else
>
> if j == 1
>
> A[i,j] = A[i,j] / A[j,j];
>
> else
>
> tmp = zero(T);
>
> for k = 1:j-1
>
> tmp += A[i,k] * A[j,k];
>
> end
>
> A[i,j] = (A[i,j] - tmp) / A[j,j];
>
> end
>
> end
>
> end
>
> end
>
> # then zero out the non-cholesky stuff
>
>
> for i = 1:N
>
> if i < N
>
> for j = i+1:N
>
> A[i,j] = zero(T);
>
> end
>
> end
>
> end
>
> end
>
>