I opened a GitHub issue here: https://github.com/JuliaOpt/Optim.jl/issues/59
Dahua

On Wednesday, June 4, 2014 10:43:56 AM UTC-5, John Myles White wrote:

> Exposing the option to control the initial approximate Hessian would be a
> good idea. If you don't mind, I'd encourage you to put a little time into
> learning GitHub, since it will make it a lot easier in the future for you
> to help fix the problems you're running into.
>
> Fixing our line search routine would also be a good thing, but that is
> potentially harder. I find the line search code quite difficult to reason
> about, so it takes me a long time to convince myself that a proposed
> change is safe to make.
>
> — John
>
> On Jun 4, 2014, at 8:38 AM, Thomas Covert <[email protected]> wrote:
>
> > When calling for a BFGS or L-BFGS optimization in Optim.jl, the initial
> > inverse Hessian is set to the identity matrix. I am running into some
> > trouble with this design decision, as my objective function is
> > numerically unstable at the first function evaluation point
> > x0 - (I \ g0), where x0 is what I know to be a good starting value and
> > g0 is the gradient at that point. If I were to compute a
> > finite-difference Hessian H, this first point would be x0 - (H \ g0),
> > which works fine (as far as I can tell).
> >
> > The numerical stability issues come from two sources: (1) g0 contains
> > entries generally quite a bit larger in absolute value than x0, and
> > (2) my objective function calls for a Cholesky factorization of a
> > matrix which is partially defined by x, and this factorization seems
> > to fail for very large absolute values of x.
> >
> > When I peeked at the source for Optim.jl, I noticed that all the
> > underlying Newton-style solver routines allow for an explicitly
> > defined initial inverse Hessian, but this interface does not seem to
> > be exposed in optimize(). Is it possible to change this? I can see how
> > the code would change, but I'm not GitHub-proficient (yet), so I don't
> > know how to make these changes and offer them as a pull request...
> >
> > By the way, MATLAB's fminunc and fmincon don't seem to suffer from
> > this problem, since MATLAB's line search operation is able to recover
> > from a Cholesky error and just look for a teensy-tiny step size that
> > works.
> >
> > Thanks in advance for any help or suggestions.
> >
> > -Thom
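
To make the scaling problem concrete, here is a minimal Julia sketch (the
badly scaled quadratic is a stand-in, not Thomas's actual objective)
comparing the first trial point under an identity initial inverse Hessian,
x0 - (I \ g0), with the one a finite-difference Hessian gives, x0 - (H \ g0):

using LinearAlgebra

# Stand-in for a badly scaled objective: f(x) = 0.5 * x' * A * x
A = Diagonal([1e6, 1.0])
f(x) = 0.5 * dot(x, A * x)
g(x) = A * x                         # analytic gradient

# Central-difference approximation to the Hessian, column by column
function fd_hessian(g, x; h = 1e-5)
    n = length(x)
    H = zeros(n, n)
    for j in 1:n
        e = zeros(n); e[j] = h
        H[:, j] = (g(x + e) - g(x - e)) / (2h)
    end
    return Symmetric(H)
end

x0 = [1.0, 1.0]
g0 = g(x0)

@show x0 - g0                        # identity invH: trial point near [-1e6, 0]
@show x0 - fd_hessian(g, x0) \ g0    # curvature-aware: lands near the minimizer

With the identity, the first trial point is dominated by the large gradient
entry; an objective that misbehaves far from x0 will blow up there before
the line search has a chance to recover.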
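On the Cholesky point, a common Julia-side workaround is to factorize with
check = false and return Inf on failure, so a backtracking line search treats
the trial point as unacceptable and shrinks the step, much like the MATLAB
behavior Thomas describes. A sketch, where S(x) is a hypothetical
parameter-dependent matrix standing in for his model:

using LinearAlgebra

S(x) = [1.0 x[1]; x[1] 1.0 + x[2]^2]       # hypothetical model matrix

function objective(x)
    F = cholesky(Symmetric(S(x)); check = false)  # don't throw if indefinite
    issuccess(F) || return Inf             # signal "bad point" to the line search
    return sum(abs2, F.L \ x)              # placeholder computation using the factor
end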
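For readers finding this thread later: recent Optim.jl releases do expose the
requested option; the BFGS constructor accepts an initial_invH keyword taking
a function that maps the starting point to the initial inverse Hessian. A
sketch assuming that API (worth checking the current Optim.jl docs):

using Optim, LinearAlgebra

f(x) = 0.5 * (1e6 * x[1]^2 + x[2]^2)   # same stand-in objective as above

H0 = Diagonal([1e6, 1.0])              # e.g. a finite-difference Hessian at x0
x0 = [1.0, 1.0]

# initial_invH supplies the initial *inverse* Hessian approximation,
# replacing the identity-matrix default that Thomas ran into.
res = optimize(f, x0, BFGS(initial_invH = x -> Matrix(inv(H0))))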
