Ok, after reading the paper that the hz_linesearch! routine is based on, I can see that I was wrong about this. Still puzzled, but definitely wrong!
On Tuesday, August 19, 2014 1:51:37 PM UTC-5, Thomas Covert wrote:

I'm seeing this same error (ERROR: assertion failed: lsr.slope[ib] < 0) again, and this time my gradients (evaluated at "reasonable" input values) match the finite-difference output generated by Calculus.jl's "gradient" function. The function I am trying to minimize is globally convex (it's a multinomial logit log-likelihood).

I encounter this assertion error after a few successful iterations of BFGS, and it is caused by NaNs in the gradient of the test point. BFGS gets to this test point because the step size it passes to hz_linesearch eventually gets to be large, and a big enough step can cause floating-point errors in the calculation of the derivatives. For example, on a recent minimization attempt, the assertion error happens when "c" (the step size passed by BFGS to hz_linesearch) appears to be about 380.

I think this is happening because hz_linesearch (a) expands the step size by a factor of 5 (see line 280 in hz_linesearch) until it encounters upward movement, and (b) passes this new value (or a moving average of it) back to the caller (i.e., BFGS). So, the next time BFGS calls hz_linesearch, it starts out with a potentially large value for the first step.

I don't really know much about line search routines, but is this the way things ought to be? I would have thought that for each new call to a line search routine, the step size should reset to a default value.

By the way, is it possible to enable display of the internal values of "c" in the line search routines? It looks like there is some debugging code in there, but I'm not sure how to turn it on.

-thom
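An illustrative sketch of the expansion behavior described above, in current Julia syntax. This is not the actual hz_linesearch! code; the names expand_step, phi, rho, and maxexpand are hypothetical, with rho = 5 matching the factor Thomas mentions, and phi(a) standing for the objective along the search direction, phi(a) = f(x + a*d):

    # Illustrative only: grow the trial step by a fixed factor until the
    # function value rises above its value at the starting point.
    function expand_step(phi, c; rho = 5.0, maxexpand = 50)
        phi0 = phi(0.0)
        for _ in 1:maxexpand
            phi(c) > phi0 && break   # upward movement found; stop expanding
            c *= rho                 # otherwise grow the step geometrically
        end
        return c   # a large c is what can push f or its gradient into NaN territory
    end

With a nearly flat or monotonically decreasing phi, c grows geometrically before any upward movement is found, which is consistent with the large step sizes reported above.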
On Wednesday, July 30, 2014 6:24:26 PM UTC-5, John Myles White wrote:

I've never seen our line search methods produce an error that wasn't caused by errors in the gradient. The line search methods generally only work with function values and gradients, so they're either buggy (which they haven't proven to be) or they're brittle to errors in function definitions/gradient definitions.

Producing better error messages would be great. I once started to do that, but realized that I needed to come back to fully understanding the line search code before I could insert useful errors. Would love to see improvements there.

— John

On Jul 30, 2014, at 3:17 PM, Thomas Covert <[email protected]> wrote:

I've done some more sleuthing and have concluded that the problem was on my end (a bug in the gradient calculation, as you predicted).

Is an inaccurate gradient the only way someone should encounter this assertion error? I don't know enough about line search methods to have an intuition about that, but if it is the case, maybe the line search routine should throw a more informative error?

-Thom

On Wednesday, July 30, 2014 3:44:51 PM UTC-5, John Myles White wrote:

Would be useful to understand exactly what goes wrong if we want to fix this problem. I'm mostly used to errors caused by inaccurate gradients, so I don't have an intuition for the cause of this problem.

— John

On Jul 30, 2014, at 10:45 AM, Thomas Covert <[email protected]> wrote:

No, I haven't tried that yet - might someday, but I like the idea of running Julia-native code all the way...

However, I did find that manually switching the line search routine to "backtracking_linesearch!" did the trick, so at least we know the problem isn't in Optim.jl's implementation of BFGS itself!

-thom

On Wednesday, July 30, 2014 12:43:16 PM UTC-5, jbeginner wrote:

This is not really a solution for this problem, but have you tried the NLopt library? From my experience it produces much more stable results, and because of problems like the one you describe I have switched to it. I think there is an L-BFGS option also, although I did not get AD to work with it. A description of all the algorithms can be seen here:

http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms

On Wednesday, July 30, 2014 12:27:36 PM UTC-4, Thomas Covert wrote:

Recently I've encountered line search errors when using Optim.jl with BFGS. Here is an example error message:

    ERROR: assertion failed: lsr.slope[ib] < 0
     in bisect! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:577
     in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:273
     in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:201
     in bfgs at /pathtojulia/.julia/v0.3/Optim/src/bfgs.jl:121
     in optimize at /pathtojulia/.julia/v0.3/Optim/src/optimize.jl:113
    while loading /pathtocode/code.jl, in expression starting on line 229

I've seen this error message before, and it's usually because I have a bug in my code that erroneously generates function values or gradients which are very large (e.g., 1e100). However, in this case I can confirm that the "x" I've passed to the optimizer is totally reasonable (abs value of all points less than 100), the function value at that x is reasonable (on the order of 1e6), the gradients are reasonable (between -100 and +100), and the entries in the approximate inverse Hessian are also reasonable (smallest abs value is about 1e-9, largest is about 7).

This isn't a failure on the first or second iteration of BFGS - it happens on the 34th iteration.

Unfortunately it's pretty hard for me to share my code or data at the moment, so I understand that it might be challenging to solve this problem, but any advice you guys can offer is appreciated!

-Thom
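For reference, a minimal sketch (again in current Julia syntax) of the finite-difference check described in the August 19 post. The objective f and gradient g here are hypothetical stand-ins for the log-likelihood and its hand-written gradient; Calculus.gradient is the finite-difference routine from Calculus.jl mentioned in the thread:

    using Calculus

    f(x) = sum(abs2, x) / 2               # stand-in objective
    g(x) = copy(x)                        # its analytic gradient

    x0 = randn(5)
    fd = Calculus.gradient(f, x0)         # finite-difference gradient
    println(maximum(abs.(fd .- g(x0))))   # should be tiny if g is correct

A discrepancy in this check is the usual culprit behind the lsr.slope[ib] < 0 assertion; as the August 19 post notes, though, a correct gradient can still produce NaNs when evaluated at an overly large trial step.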

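The workaround reported above (swapping the Hager-Zhang line search for backtracking) looks roughly like this against the current Optim.jl / LineSearches.jl API; the 0.3-era API in this thread selected the line search differently, and the objective and gradient below are hypothetical stand-ins:

    using Optim, LineSearches

    f(x) = sum(abs2, x) / 2      # stand-in objective
    g!(G, x) = copyto!(G, x)     # in-place gradient, modern g!(G, x) signature

    res = optimize(f, g!, randn(5),
                   BFGS(linesearch = LineSearches.BackTracking()))
    println(Optim.minimizer(res))

Backtracking only ever shrinks its trial step from the initial guess, which sidesteps the aggressive bracket expansion discussed above.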