On Mon, Mar 22, 2010 at 11:15 AM, Michael Creel <michael.cr...@uab.es> wrote:
>
>
> On Mon, Mar 22, 2010 at 10:52 AM, Jaroslav Hajek <high...@gmail.com> wrote:
>>
>> On Mon, Mar 22, 2010 at 10:34 AM, Michael Creel <michael.cr...@uab.es>
>> wrote:
>> >
>> >
>> > On Sun, Mar 21, 2010 at 6:42 PM, Søren Hauberg <so...@hauberg.org>
>> > wrote:
>> >>
>> >> One problem with the current approach is that you currently can't use
>> >> function handles with 'bfgs'. In some situations they are simpler to
>> >> use than strings.
>> >>
>> >> I don't think you'll see any noticeable difference in terms of speed. I
>> >> actually think you would be able to re-implement the entire function as
>> >> an m-file without seeing much loss of speed, as usually most of the
>> >> computation time is spent in the function being optimised rather than
>> >> in
>> >> the function doing the optimisation.
>> >>
>> >
>> > Well, fminunc in Octave is pretty much a perfect substitute for the
>> > functionality of bfgsmin.
>>
>> I don't think this is true. bfgsmin can do limited-memory updates,
>> fminunc can't. At present, fminunc is modelled after fsolve with its
>> trust-region approach. I didn't actually tune the fminunc
>> implementation at all; I thought that the trust-region approach would
>> be superior to the line search of bfgsmin. It appears this is not
>> really true, though.
>>
>> At present, it seems to me that having the limited-memory updating
>> capability is more important than trust-region stepping. So I'd be
>> glad to reimplement fminunc using the line search strategy from
>> bfgsmin, allowing the switch between inverse BFGS / factorized BFGS /
>> LBFGS updating.
>>
>> > It works with function handles, and is an .m file
>> > implementation. The advantage of bfgsmin is that it is faster. It is
>> > true
>> > that for expensive objective functions, the difference won't be
>> > important.
>> > However, if you are doing a lot of minimizations of cheap functions,
>> > then
>> > the difference is around 2X.
>>
>> I don't think this factor matters at all. Things can easily change
>> when new code optimizations become available. Besides, our testing was
>> very limited, and it was not in fact clear whether the factor of 2x
>> came purely from code efficiency or whether bfgsmin just did less
>> work (fewer iterations).
>>
>> I probably will try to reimplement fminunc based on bfgsmin, but using
>> m-code, in the not-so-distant future.
>> I'd appreciate any help on this.
>>
>
> That sounds good. My experience with bfgsmin is that having a stepsize
> algorithm that can switch from fast and pretty good (the Newton step) to
> slow but reliable (bisection) works well. I'm not knowledgeable about
> trust regions. The other thing that has worked well is to fall back on
> steepest descent when the BFGS direction doesn't work. It is true that
> limited-memory BFGS makes a big difference when there are many
> parameters.
>
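
For concreteness, the stepsize fallback described above might look
roughly like this (a minimal sketch with hypothetical names, not the
actual bfgsmin code):

function [x, fx] = try_step (f, g, x, d)
  ## Stepsize fallback: try the full (Newton-like) step first, bisect
  ## the step length if it fails to decrease f, and switch to steepest
  ## descent if the search direction itself turns out to be bad.
  fx = f (x);
  for pass = 1:2
    a = 1;
    for k = 1:25
      fn = f (x + a*d);
      if (fn < fx)
        x += a*d;
        fx = fn;
        return;            # accepted step
      endif
      a /= 2;              # slow but reliable: bisection
    endfor
    d = -g (x);            # fall back on steepest descent
  endfor
endfunction

Every iteration then ends with either a decrease in f or an honest
steepest-descent attempt.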

One big advantage of a trust-region approach for fsolve is that it
makes it meaningful to update the Jacobian matrix after each step,
even if the step increases the residual. This allows incremental
correction of the Jacobian by a series of bad steps, which can
eventually make it "repair" itself and then predict a good step,
without recomputing the Jacobian. I think this is one of the strong
pluses of the current fsolve implementation.
This is also true for minimization, but only if the update allows it.
Alas, the BFGS update requires the curvature condition (s'*y > 0 for
the step s and gradient difference y), which is often violated with
bad steps, diminishing this advantage of a trust-region approach. It
would be possible to employ a Hessian update that does not preserve
definiteness, like SR1, but then you lose the ability to solve the
trust-region subproblem via the efficient dogleg method and you need
to resort to more general approaches.
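
To make the curvature condition concrete, a guarded inverse-BFGS
update in the usual textbook form would look roughly like this (a
sketch, not the code in fminunc or bfgsmin):

function H = bfgs_update (H, s, y)
  ## Inverse-Hessian BFGS update.  The curvature condition s'*y > 0
  ## is what keeps H positive definite; after a bad step it often
  ## fails, and then the only safe choice for BFGS is to skip the
  ## update and keep the old H.
  sy = s' * y;
  if (sy <= sqrt (eps) * norm (s) * norm (y))
    return;                # curvature condition violated
  endif
  rho = 1 / sy;
  I = eye (numel (s));
  H = (I - rho*s*y') * H * (I - rho*y*s') + rho*(s*s');
endfunction

SR1, by contrast, is well defined whenever (y - B*s)'*s is nonzero,
but the resulting approximation may be indefinite, which is exactly
what breaks the dogleg solver.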

Back to practice: currently fminunc only updates the Hessian after a
successful step, and hence does not actually enjoy the primary benefit
of the trust-region approach. That could be fixed, but I would need to
do some more studying on (approximately) solving trust-region
subproblems with indefinite matrices (as for the dogleg method, I just
stole it from MINPACK).
Unlike bfgsmin, fminunc also can't use limited-memory updates. That's
why I think rewriting it based on bfgsmin is a good idea at this
point: bfgsmin is the best minimizer Octave users have had for a
while.
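
For reference, limited-memory updating avoids storing an n-by-n
matrix altogether: the standard two-loop recursion reconstructs the
action of the inverse Hessian from the last m step/gradient-difference
pairs. A sketch of the textbook algorithm (not the bfgsmin
implementation), with the pairs stored as columns of S and Y:

function d = lbfgs_direction (g, S, Y)
  ## Two-loop recursion: applies the implicit inverse-Hessian
  ## approximation to the gradient g, using the m most recent pairs
  ## s_i (columns of S) and y_i (columns of Y).  Returns d = -H*g.
  m = columns (S);
  rho = zeros (m, 1);
  a = zeros (m, 1);
  q = g;
  for i = m:-1:1
    rho(i) = 1 / (Y(:,i)' * S(:,i));
    a(i) = rho(i) * (S(:,i)' * q);
    q -= a(i) * Y(:,i);
  endfor
  gam = (S(:,m)' * Y(:,m)) / (Y(:,m)' * Y(:,m));  # H0 = gam*I scaling
  d = gam * q;
  for i = 1:m
    b = rho(i) * (Y(:,i)' * d);
    d += (a(i) - b) * S(:,i);
  endfor
  d = -d;
endfunction

With m around 5-20 this costs O(m*n) per iteration instead of O(n^2),
which is where the big difference for many parameters comes from.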



> My claim of a 2X factor is based on the casual testing you and I did a few
> months ago, plus testing I did when first moving from .m code to .cc. This
> factor clearly depends a lot upon the objective function used. It reflects
> the test code I was using, which is representative of my work, but others'
> mileage will certainly vary. There have also been a lot of improvements to
> Octave over that time, so I suspect that the factor would be closer to 1
> now.  I put a lot more weight upon reliability and crash-proofness than on
> pure speed. Using .m code probably would favor reliability.
>

I know that, but I don't recall us actually comparing the numbers of
iterations. That being said, it is even possible that the cost per
iteration was actually higher for bfgsmin, and that it simply won by
performing significantly fewer iterations.

-- 
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz
