Ok, after reading the paper that the hz_linesearch! routine is based on, I can see that I was wrong about this. Still puzzled, but definitely wrong!
On Tuesday, August 19, 2014 1:51:37 PM UTC-5, Thomas Covert wrote:

I'm seeing this same error (ERROR: assertion failed: lsr.slope[ib] < 0) again, and this time my gradients (evaluated at "reasonable" input values) match the finite-difference output generated by Calculus.jl's "gradient" function. The function I am trying to minimize is globally convex (it's a multinomial logit log-likelihood).

I encounter this assertion error after a few successful iterations of BFGS, and it is caused by NaNs in the gradient of the test point. BFGS gets to this test point because the step size it passes to hz_linesearch eventually gets to be large, and a big enough step can cause floating-point errors in the calculation of the derivatives. For example, on a recent minimization attempt, the assertion error happens when "c" (the step size passed by BFGS to hz_linesearch) appears to be about 380.

I think this is happening because hz_linesearch (a) expands the step size by a factor of 5 (see line 280 in hz_linesearch) until it encounters upward movement, and (b) passes this new value (or a moving average of it) back to the caller (i.e., BFGS). So, the next time BFGS calls hz_linesearch, it starts out with a potentially large value for the first step.

I don't really know much about line search routines, but is this the way things ought to be? I would have thought that for each new call to a line search routine, the step size should reset to a default value.

By the way, is it possible to enable display of the internal values of "c" in the line search routines? It looks like there is some debugging code in there, but I'm not sure how to turn it on.

-thom
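An illustrative sketch of the expansion behavior described above, in current Julia syntax. This is not the actual hz_linesearch! code; the names expand_step, phi, rho, and maxexpand are hypothetical, with rho = 5 matching the factor Thomas mentions, and phi(a) standing for the objective along the search direction, phi(a) = f(x + a*d):

    # Illustrative only: grow the trial step by a fixed factor until the
    # function value rises above its value at the starting point.
    function expand_step(phi, c; rho = 5.0, maxexpand = 50)
        phi0 = phi(0.0)
        for _ in 1:maxexpand
            phi(c) > phi0 && break   # upward movement found; stop expanding
            c *= rho                 # otherwise grow the step geometrically
        end
        return c   # a large c is what can push f or its gradient into NaN territory
    end

With a nearly flat or monotonically decreasing phi, c grows geometrically before any upward movement is found, which is consistent with the large step sizes reported above.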
On Wednesday, July 30, 2014 6:24:26 PM UTC-5, John Myles White wrote:

I've never seen our line search methods produce an error that wasn't caused by errors in the gradient. The line search methods generally only work with function values and gradients, so they're either buggy (which they haven't proven to be) or they're brittle to errors in function definitions/gradient definitions.

Producing better error messages would be great. I once started to do that, but realized that I needed to come back to fully understanding the line search code before I could insert useful errors. Would love to see improvements there.

— John

On Jul 30, 2014, at 3:17 PM, Thomas Covert <[email protected]> wrote:

I've done some more sleuthing and have concluded that the problem was on my end (a bug in the gradient calculation, as you predicted).

Is an inaccurate gradient the only way someone should encounter this assertion error? I don't know enough about line search methods to have an intuition about that, but if it is the case, maybe the line search routine should throw a more informative error?

-Thom

On Wednesday, July 30, 2014 3:44:51 PM UTC-5, John Myles White wrote:

Would be useful to understand exactly what goes wrong if we want to fix this problem. I'm mostly used to errors caused by inaccurate gradients, so I don't have an intuition for the cause of this problem.

— John

On Jul 30, 2014, at 10:45 AM, Thomas Covert <[email protected]> wrote:

No, I haven't tried that yet - might someday, but I like the idea of running Julia-native code all the way...

However, I did find that manually switching the line search routine to "backtracking_linesearch!" did the trick, so at least we know the problem isn't in Optim.jl's implementation of BFGS itself!

-thom

On Wednesday, July 30, 2014 12:43:16 PM UTC-5, jbeginner wrote:

This is not really a solution for this problem, but have you tried the NLopt library? From my experience it produces much more stable results, and because of problems like the one you describe I have switched to it. I think there is an L-BFGS option also, although I did not get AD to work with it. A description of all the algorithms can be seen here:

http://ab-initio.mit.edu/wiki/index.php/NLopt_Algorithms

On Wednesday, July 30, 2014 12:27:36 PM UTC-4, Thomas Covert wrote:

Recently I've encountered line search errors when using Optim.jl with BFGS. Here is an example error message:

    ERROR: assertion failed: lsr.slope[ib] < 0
     in bisect! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:577
     in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:273
     in hz_linesearch! at /pathtojulia/.julia/v0.3/Optim/src/linesearch/hz_linesearch.jl:201
     in bfgs at /pathtojulia/.julia/v0.3/Optim/src/bfgs.jl:121
     in optimize at /pathtojulia/.julia/v0.3/Optim/src/optimize.jl:113
    while loading /pathtocode/code.jl, in expression starting on line 229

I've seen this error message before, and it's usually because I have a bug in my code that erroneously generates function values or gradients which are very large (e.g., 1e100). However, in this case I can confirm that the "x" I've passed to the optimizer is totally reasonable (abs value of all points less than 100), the function value at that x is reasonable (on the order of 1e6), the gradients are reasonable (between -100 and +100), and the entries in the approximate inverse Hessian are also reasonable (smallest abs value is about 1e-9, largest is about 7).

This isn't a failure on the first or second iteration of BFGS - it happens on the 34th iteration.

Unfortunately it's pretty hard for me to share my code or data at the moment, so I understand that it might be challenging to solve this problem, but any advice you guys can offer is appreciated!

-Thom
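For reference, a minimal sketch (again in current Julia syntax) of the finite-difference check described in the August 19 post. The objective f and gradient g here are hypothetical stand-ins for the log-likelihood and its hand-written gradient; Calculus.gradient is the finite-difference routine from Calculus.jl mentioned in the thread:

    using Calculus

    f(x) = sum(abs2, x) / 2               # stand-in objective
    g(x) = copy(x)                        # its analytic gradient

    x0 = randn(5)
    fd = Calculus.gradient(f, x0)         # finite-difference gradient
    println(maximum(abs.(fd .- g(x0))))   # should be tiny if g is correct

A discrepancy in this check is the usual culprit behind the lsr.slope[ib] < 0 assertion; as the August 19 post notes, though, a correct gradient can still produce NaNs when evaluated at an overly large trial step.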

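The workaround reported above (swapping the Hager-Zhang line search for backtracking) looks roughly like this against the current Optim.jl / LineSearches.jl API; the 0.3-era API in this thread selected the line search differently, and the objective and gradient below are hypothetical stand-ins:

    using Optim, LineSearches

    f(x) = sum(abs2, x) / 2      # stand-in objective
    g!(G, x) = copyto!(G, x)     # in-place gradient, modern g!(G, x) signature

    res = optimize(f, g!, randn(5),
                   BFGS(linesearch = LineSearches.BackTracking()))
    println(Optim.minimizer(res))

Backtracking only ever shrinks its trial step from the initial guess, which sidesteps the aggressive bracket expansion discussed above.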