[julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-03 Thread Tony Kelman
Thomas,

If you're trying to invert (factorize) a matrix as part of your objective 
function, it's generally a much better idea to reformulate your 
optimization problem and use a different algorithm. Anywhere that you're 
essentially trying to use inv(A)*x, you should try introducing a new 
variable y and a corresponding linear equality constraint x = A * y. Most of 
the algorithms in Optim don't implement equality constraints, but you've got 
other choices of optimization solvers. If your objective function in terms of 
x and y is linear, then you can use Clp, GLPK, etc. If it's still some 
complicated nonlinear thing even in terms of x and y, use JuMP with Ipopt. 
NLopt might work too, but it's not tied into the convenient JuMP reverse-mode 
autodiff framework yet.
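
A rough sketch of the reformulation, in current JuMP syntax (the 2014 macros 
were @defVar / @addConstraint); the names and the quadratic objective here 
are placeholders, not your actual problem:

using JuMP, Ipopt

# Replace inv(A)*b inside an objective with a new variable y
# satisfying the linear constraint A*y == b.
function solve_reformulated(A::Matrix{Float64}, b::Vector{Float64})
    n = size(A, 1)
    model = Model(Ipopt.Optimizer)
    @variable(model, y[1:n])
    @constraint(model, A * y .== b)       # y stands in for inv(A)*b
    @objective(model, Min, sum(y[i]^2 for i in 1:n))  # placeholder objective
    optimize!(model)
    return value.(y)
end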

-Tony


On Monday, June 2, 2014 11:43:32 AM UTC-7, Thomas Covert wrote:

 I'm really excited to use DualNumbers and the autodiff option in Optim.jl. 
 However, the functions I would like to optimize often make use of linear 
 algebra, including Cholesky factorizations.  At present, DualNumbers.jl 
 does not implement a Cholesky factorization.  My first pass at a workaround 
 for this was to just write out a standard Cholesky solver by hand (i.e., 
 just coding up the equations listed on Wikipedia).  This seems to work fine 
 - i.e., my choldn() function factors a DualNumber array into upper and 
 lower triangular matrices whose product is the input.

 However, my choldn function listed below is MUCH slower than the LAPACK 
 call built into chol().  I can't say I'm surprised by this, but the speed 
 hit makes the whole enterprise of using autodiff with DualNumbers 
 considerably less useful.

 I was hoping to find some neat linear algebra trick that would let me 
 compute a DualNumber cholesky factorization without having to resort to 
 non-LAPACK code, but I haven't found it yet.  That is, I figured that I 
 could compute the cholesky factorization of the real part of the matrix and 
 then separately compute a matrix which represented the DualNumber part. 
  However, I've not yet found a math text describing such a beast.

 Any ideas?  Am I wrong in thinking that a Cholesky decomposition of a 
 square array with DualNumbers is a well-defined mathematical object?

 I've included my code for choldn() below.

 Thanks in advance.


 CODE 

 function choldn!(A::Array)
     N = size(A, 1)
     T = eltype(A)
     for j = 1:N            # cols
         for i = j:N        # rows
             if i == j      # diagonal entry
                 if i == 1
                     A[i,j] = sqrt(A[i,j])
                 else
                     tmp = zero(T)
                     for k = 1:j-1
                         tmp += A[j,k]^2
                     end
                     A[i,j] = sqrt(A[i,j] - tmp)
                 end
             else           # below-diagonal entry
                 if j == 1
                     A[i,j] = A[i,j] / A[j,j]
                 else
                     tmp = zero(T)
                     for k = 1:j-1
                         tmp += A[i,k] * A[j,k]
                     end
                     A[i,j] = (A[i,j] - tmp) / A[j,j]
                 end
             end
         end
     end
     # then zero out the strict upper triangle (the non-Cholesky stuff)
     for i = 1:N
         if i < N
             for j = i+1:N
                 A[i,j] = zero(T)
             end
         end
     end
     return A
 end



[julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-03 Thread Tony Kelman
Whoops, didn't finish writing a sentence there... was saying that most of 
the algorithms in Optim don't implement equality constraints, but you've 
got other choices of optimization solvers.


On Tuesday, June 3, 2014 3:39:44 AM UTC-7, Tony Kelman wrote:

 [snip: quoted text repeated above]



Re: [julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-03 Thread Thomas R Covert
Thanks for the suggestion, Tony.  The problem I'm working with is fitting a 
Gaussian process to data, so my objective function (the log likelihood of a 
GP given some data x) includes both an x' inv(K(theta)) x term and a log 
determinant of K(theta).  Here K() is a big, dense, positive definite weight 
matrix and theta is the parameter(s) controlling the weights.

As far as I can tell, the fastest and most numerically stable way to implement 
these calculations is with a Cholesky decomposition of K(theta).  In my code 
where I've already done the work of finding derivatives by hand, I use a PDMat 
for K and the associated fast routines (like invquad and logdet) available for 
PDMats.  To use the DualNumbers autodiff code, I needed a way to 
Cholesky-factorize a symmetric matrix of dual numbers (which I assume was 
something like K(theta + epsilon) in the autodiff code).

However, if you think there's a smarter way to do this kind of calculation that 
avoids a Cholesky factorization, I would definitely take a look.
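
For what it's worth, both terms fall out of a single factorization without 
ever forming inv(K): if K = L*L', then x' inv(K) x = ||L \ x||^2 and 
logdet(K) = 2 * sum(log(diag(L))).  A minimal sketch in current Julia syntax 
(2014-era Julia spelled the factorization cholfact), assuming K and x are 
already built:

using LinearAlgebra

# GP log-likelihood pieces from a single Cholesky factorization.
# K is assumed symmetric positive definite.
function gp_nll_terms(K::Matrix{Float64}, x::Vector{Float64})
    C = cholesky(Symmetric(K))    # K = L * L'
    v = C.L \ x                   # one triangular solve
    quad = dot(v, v)              # x' * inv(K) * x
    ld = 2 * sum(log, diag(C.L))  # logdet(K); logdet(C) also works
    return quad, ld
end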

 On Jun 3, 2014, at 5:41 AM, Tony Kelman t...@kelman.net wrote:
 
 [snip: quoted text repeated above]


Re: [julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-03 Thread Tony Kelman
Alright, if K is dense then Ipopt may not be your best choice. How do the 
theta parameters enter into K(theta)? Is the problem convex? Would the dual 
problem be easier to work with? Have you read Boyd and Vandenberghe more 
recently than I have? This sounds suspiciously like it could be expressed as 
a semidefinite program, though I think the Julia interfaces to good SDP 
solvers are still WIP.

Yes, the majority of solvers for convex optimization rely on a Cholesky 
decomposition at some point; the question is whether or not you pose the 
problem in such a way that you have to differentiate the factors. I suspect 
you shouldn't have to, but I also haven't worked with this particular type of 
problem in a few years.


On Tuesday, June 3, 2014 4:33:01 AM UTC-7, Thomas Covert wrote:

 [snip: quoted text repeated above]

Re: [julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-03 Thread Thomas R Covert
In the GP setting I'm using, K is the set of pairwise Gaussian kernel 
evaluations between all the points in my data.  So if a point is (x_i, z_i) 
(x is the dependent variable and the z's are effectively explanatory 
variables), K_ij is more or less exp(-0.5 * (z_i - z_j)^2 * theta).
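
In code that kernel matrix is a one-liner; a sketch with illustrative names, 
where z is the vector of explanatory points and theta may be a plain Float64 
or a Dual:

# Pairwise Gaussian kernel matrix, as described above.
K(z, theta) = [exp(-0.5 * (zi - zj)^2 * theta) for zi in z, zj in z]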

I don't think the problem is convex, but MATLAB's fminunc, started from tens 
of random initial values, always converges to the same point.

Anyway, a previous commenter pointed out that LU factorization is implemented 
for dual numbers, so I'll give that a try.
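
A sketch of that LU route, leaning on the generic method Miles describes 
below; since K is positive definite its determinant is positive, so the 
permutation sign from pivoting washes out under abs (names are illustrative):

using LinearAlgebra, DualNumbers

# GP terms via the generic (non-LAPACK) pivoted LU, which accepts
# Dual entries. K is assumed symmetric positive definite.
function gp_terms_lu(K::Matrix{<:Dual}, x::Vector)
    F = lu(K)
    ld = sum(u -> log(abs(u)), diag(F.U))  # log|det K| == logdet K for PD K
    quad = dot(x, F \ x)                   # x' * inv(K) * x via the factors
    return quad, ld
end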


 On Jun 3, 2014, at 7:15 AM, Tony Kelman t...@kelman.net wrote:
 
 [snip: quoted text repeated above]

[julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-02 Thread Miles Lubin
Hi Thomas,

This is definitely a reasonable thing to want to do, and there are a number 
of methods to differentiate through a matrix factorization. Julia doesn't 
yet have a generic chol() method 
(see https://github.com/JuliaLang/julia/issues/1629), but it does have a 
generic LU decomposition method that works with DualNumbers. So if you 
replace cholfact with lufact, your code might just work. However, this won't 
use LAPACK, and off-hand I'm not sure if it's possible to reformulate a 
Cholesky with dual numbers into something LAPACK knows how to compute. There 
also might be some performance issues with your hand-written Cholesky 
factorization; try:

function choldn!{T}(A::Matrix{T})

instead of

function choldn!(A::Array)
 ...
 T = eltype(A);


Exactly how much slower is it?
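
One concrete such method for Cholesky, which also answers the 
well-definedness question upthread: write A = A0 + eps*A1 and 
L = L0 + eps*L1. Expanding L*L' = A gives L0*L0' = A0 and 
L1*L0' + L0*L1' = A1, whose lower-triangular solution is 
L1 = L0 * Phi(inv(L0) * A1 * inv(L0)'), where Phi keeps the lower triangle 
and halves the diagonal. So the Cholesky of a dual matrix is well defined 
whenever the real part is positive definite, and LAPACK can still do the 
heavy lifting on the real part. A sketch in current syntax (realpart/epsilon 
are the DualNumbers accessors; chol_dual is a hypothetical name):

using LinearAlgebra, DualNumbers

# Cholesky of a Dual-valued symmetric matrix, LAPACK on the real part.
# Returns L = L0 + eps*L1 with L*L' == A.
function chol_dual(A::Matrix{<:Dual})
    A0 = realpart.(A)
    A1 = epsilon.(A)                      # the derivative part
    L0 = cholesky(Symmetric(A0)).L        # fast LAPACK factorization
    X  = L0 \ (A1 / L0')                  # inv(L0) * A1 * inv(L0)', symmetric
    L1 = L0 * (LowerTriangular(X) - Diagonal(X) / 2)   # L0 * Phi(X)
    return Dual.(Matrix(L0), Matrix(L1))
end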

On Monday, June 2, 2014 2:43:32 PM UTC-4, Thomas Covert wrote:

 [snip: quoted text repeated above]



Re: [julia-users] Re: Optim/DualNumbers question (maybe a math question?)

2014-06-02 Thread Thomas Covert
Thanks for the quick response.  I'll check and see if an LU decomposition 
will do the trick.

As far as my own Cholesky code goes, it is quite a bit slower, even after 
your edits.  For a 1000x1000 Float64 matrix (i.e., X = randn(1000,1000); 
X = X' * X), here are the timing results I get:

Three calls of @time choldn!(X), the slow one I wrote:

elapsed time: 1.313375682 seconds (9102620 bytes allocated)
elapsed time: 1.231566056 seconds (896 bytes allocated)
elapsed time: 1.240786689 seconds (896 bytes allocated)

Three calls of @time chol(X), the built-in Cholesky factorizer:

elapsed time: 0.032613575 seconds (8000736 bytes allocated)
elapsed time: 0.041127357 seconds (8000688 bytes allocated)
elapsed time: 0.066425175 seconds (8000688 bytes allocated)


If I had to guess, the significant timing differences here are due to smart 
multi-threading in the LAPACK Cholesky code.

For reference, these are running times on my mid-2009 MacBook Pro with 8 GB 
of RAM and a 2.53 GHz Core 2 Duo.

On Mon, Jun 2, 2014 at 4:36 PM, Miles Lubin miles.lu...@gmail.com wrote:

 [snip: quoted text repeated above]