When calling for a BFGS or L-BFGS optimization in Optim.jl, the initial 
inverse Hessian is set to the identity matrix.  I am running into trouble 
with this design decision: my objective function is numerically unstable 
at the first function evaluation point x0 - (I \ g0), where x0 is what I 
know to be a good starting value and g0 is the gradient at that point.  
If I were instead to compute a finite-difference Hessian H, the first 
point would be x0 - (H \ g0), which works fine as far as I can tell.
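
To make the scaling issue concrete, here is a toy sketch (a stand-in, not 
my actual objective) comparing the first quasi-Newton trial point under 
the identity initial inverse Hessian versus a finite-difference Hessian:

    using LinearAlgebra

    # Toy stand-in (not my actual objective): a badly scaled quadratic whose
    # gradient entries dwarf the entries of x0, like my real problem.
    f(x) = 0.5 * dot(x, 1e4 .* x)
    g(x) = 1e4 .* x

    # Central-difference Hessian built from the gradient.
    function fd_hessian(g, x; h = 1e-6)
        n = length(x)
        H = zeros(n, n)
        for j in 1:n
            e = zeros(n); e[j] = h
            H[:, j] = (g(x .+ e) .- g(x .- e)) ./ (2h)
        end
        return Symmetric(H)
    end

    x0 = ones(2)
    g0 = g(x0)

    x_identity = x0 - (I \ g0)    # first trial point with invH = I: a huge step
    H = fd_hessian(g, x0)
    x_newton   = x0 - (H \ g0)    # with the finite-difference Hessian: a sane step

    @show x_identity x_newton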

The numerical stability issues come from two sources: (1) the entries of 
g0 are generally quite a bit larger in absolute value than those of x0, 
and (2) my objective function requires a Cholesky factorization of a 
matrix that is partially defined by x, and this factorization seems to 
fail for very large absolute values of x.
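
Here is a toy sketch of failure mode (2); the matrix construction is made 
up purely for illustration:

    using LinearAlgebra

    # Made-up matrix that is positive definite only while |x[1]| < 2,
    # so the factorization breaks once x wanders to large values.
    build_matrix(x) = Symmetric([2.0 x[1]; x[1] 2.0])

    cholesky(build_matrix([0.5]))        # fine
    try
        cholesky(build_matrix([1e4]))    # not positive definite
    catch err
        @show err                        # PosDefException
    end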

When I peeked at the source for Optim.jl, I noticed that all the underlying 
Newton-style solver routines allow for an explicitly specified initial 
inverse Hessian, but this option does not seem to be exposed through 
optimize().  Is it possible to change this?  I can see how the code would 
change, but I'm not GitHub-proficient (yet), so I don't know how to make 
these changes and offer them as a pull request...
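
For what it's worth, here is roughly the kind of call I wish I could 
write; the initial_invH keyword is my invention (it is exactly the option 
I'm asking for), so treat the whole snippet as a hypothetical sketch:

    using Optim, LinearAlgebra

    rosenbrock(x) = (1.0 - x[1])^2 + 100.0 * (x[2] - x[1]^2)^2
    function rosenbrock_grad!(g, x)
        g[1] = -2.0 * (1.0 - x[1]) - 400.0 * x[1] * (x[2] - x[1]^2)
        g[2] = 200.0 * (x[2] - x[1]^2)
    end

    x0    = [-1.2, 1.0]
    H0inv = Matrix{Float64}(I, 2, 2)   # in my case: inverse of a finite-difference Hessian at x0

    # Hypothetical: pass the initial inverse Hessian straight through to the solver
    res = optimize(rosenbrock, rosenbrock_grad!, x0,
                   BFGS(initial_invH = x -> H0inv))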

By the way, MATLAB's fminunc and fmincon don't seem to suffer from this 
problem, since MATLAB's line search is able to recover from a Cholesky 
error and just look for a teensy-tiny step size that works.
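
In the meantime, one guard I could add on my own side (an assumption about 
how a backtracking line search behaves, not a claim about MATLAB's 
internals): have the objective report Inf when the factorization fails, so 
the line search rejects the trial point and shrinks the step:

    using LinearAlgebra

    function safe_objective(x)
        A = build_matrix(x)             # build_matrix from the sketch above
        F = cholesky(A; check = false)  # don't throw on failure
        issuccess(F) || return Inf      # signal "bad point" to the line search
        return sum(abs2, F.L \ ones(size(A, 1)))   # stand-in for the real objective
    end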

Thanks in advance for any help or suggestions.

-Thom
