James, thank you, I will certainly give it a go in the next couple of
days, and will let you know how it works out. Most of the minimization
problems in my field (theoretical nuclear physics) are very sensitive to
some parameters and very insensitive to others, and that can "switch
around" at some points in parameter space: parameters that "used to be
important" become next to irrelevant in that region, while others pick
up in strength. Thank you very much for the link, and, as mentioned, I
will update you on the results!
James Bergstra wrote:
Two things:
1.

On Tue, Nov 29, 2005 at 10:47:17PM +0100, Max Belushkin wrote:
> Hmm, from what I read on the conjugate gradient algorithms, it's
> supposed to [basically] be a non-orthogonal version of the steepest
> descent. But that picks the direction just based on the current
> gradient, which is usually bad. Conjugate gradient picks the direction
> based on the current gradient AND the previous gradient, which means it
> "knows" about the first step direction at any step, which, I guess, is
> why it's good to restart it from time to time.

CG algorithms work by repeated line minimization in R^n. The minimizer picks a
point, your function comes back with the gradient at that point, and then the
minimizer finds a step size that minimizes the value of your function along that
line. If your function is more nonlinear in some directions than in others,
then this algorithm might not work very well.
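To see what I mean, here is a small self-contained toy (a sketch I have not
run, so treat it accordingly) that points GSL's Fletcher-Reeves CG at a
deliberately badly scaled quadratic, f(x) = x0^2 + 1000*x1^2. The curvature
along x1 is 1000 times that along x0, and in my experience that is the kind
of function where the line minimizations struggle:

  #include <stdio.h>
  #include <gsl/gsl_multimin.h>

  /* f(x) = x0^2 + 1000 x1^2: curvature differs hugely between directions. */
  static double my_f (const gsl_vector *x, void *params)
  {
    double x0 = gsl_vector_get (x, 0), x1 = gsl_vector_get (x, 1);
    return x0 * x0 + 1000.0 * x1 * x1;
  }

  static void my_df (const gsl_vector *x, void *params, gsl_vector *g)
  {
    gsl_vector_set (g, 0, 2.0 * gsl_vector_get (x, 0));
    gsl_vector_set (g, 1, 2000.0 * gsl_vector_get (x, 1));
  }

  static void my_fdf (const gsl_vector *x, void *params,
                      double *f, gsl_vector *g)
  {
    *f = my_f (x, params);
    my_df (x, params, g);
  }

  int main (void)
  {
    gsl_multimin_function_fdf func = { &my_f, &my_df, &my_fdf, 2, NULL };
    gsl_multimin_fdfminimizer *s =
      gsl_multimin_fdfminimizer_alloc (gsl_multimin_fdfminimizer_conjugate_fr, 2);
    gsl_vector *x = gsl_vector_alloc (2);
    int status, iter = 0;

    gsl_vector_set (x, 0, 5.0);
    gsl_vector_set (x, 1, 5.0);
    gsl_multimin_fdfminimizer_set (s, &func, x, 0.01, 1e-4);

    do {
      status = gsl_multimin_fdfminimizer_iterate (s);
      if (status) break;   /* e.g. GSL_ENOPROG: the line search stalled */
      printf ("%3d  f = %g\n", ++iter, s->f);
      status = gsl_multimin_test_gradient (s->gradient, 1e-6);
    } while (status == GSL_CONTINUE && iter < 200);

    gsl_vector_free (x);
    gsl_multimin_fdfminimizer_free (s);
    return 0;
  }

Printing f per iteration on something like this is a cheap way to see how a
minimizer behaves when the scales of the dimensions differ.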
2.
> However, I never heard it only varies one parameter at a time - if
> that is the case, and is the reason for the plateaus, I really need to
> go searching for a better minimization routine, as this would not be
> suitable with so many parameters, which, yes, have very different
> influence on the overall chi squared.
If this is the case with your function (or even if it is not) I'd appreciate it
if you tried out the minimization routine called
msl_multimin_fdfminimizer_deltabardelta
that is implemented in my (recently-cleaned) libmsl
http://savannah.nongnu.org/projects/libmsl
I've only ever used it on a single type of minimization (neural network gradient
descent) and I'd appreciate it if you let me know if it works well on your
function too. It is designed to simultaneously optimize over many dimensions
whose degrees of nonlinearity are very different. It does this by continually
estimating the diagonal of the Hessian, to make more efficient use of the
gradient.
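I won't paste the library code here, but the underlying rule is the
"delta-bar-delta" update (Jacobs 1988): each parameter gets its own step
size, which grows while the sign of its gradient component agrees with a
running average of past gradients, and shrinks when the sign flips. Very
roughly, something like the sketch below; the names and constants are mine
and illustrative only, not the actual libmsl interface:

  /* One delta-bar-delta step for n parameters (illustrative sketch).
     g[i]    : current gradient component
     dbar[i] : exponential moving average of past gradients
     eta[i]  : per-parameter step size
     kappa, phi, theta : tuning constants, e.g. 0.05, 0.7, 0.7 */
  static void delta_bar_delta_step (double *w, const double *g,
                                    double *dbar, double *eta, int n,
                                    double kappa, double phi, double theta)
  {
    int i;
    for (i = 0; i < n; i++) {
      if (g[i] * dbar[i] > 0.0)
        eta[i] += kappa;       /* consistent sign: grow the step additively */
      else if (g[i] * dbar[i] < 0.0)
        eta[i] *= phi;         /* sign flip: we overshot, shrink the step */
      w[i] -= eta[i] * g[i];   /* per-dimension gradient step */
      dbar[i] = (1.0 - theta) * g[i] + theta * dbar[i];
    }
  }

The per-parameter step sizes end up adapting to the curvature in each
dimension, which is the sense in which it estimates the diagonal of the
Hessian.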
Søren Christensen wrote:
Hello Max,
Regarding part two of your question:
I am no mathematician either, but could it be that when you see the
mentioned plateau it is not actually a plateau, but one (or more) of
your parameters has a very small influence on the RSQ? E.g. several
orders of magnitude smaller than the others, so that

19.9809 -> ... <keep basically idle for some 100s of iterations> -> 19.97 -> 19.5 -> ...

is really

19.9809 -> ... -> 19.9809000040 -> 19.9809000036 -> 19.9809000003 -> ... -> 19.97 -> 19.5 -> ...
I am not too much into the details of the minimizer, but I seem to
recall it minimizes over one parameter at a time.
Is that a possibility?
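If you want to check, one possibility (untested, just sketched into this
mail) is to print the gradient components every iteration and see whether
one of them is orders of magnitude smaller than the rest. Something along
these lines inside the iteration loop, with s being the
gsl_multimin_fdfminimizer from your code:

  /* Print each gradient component to see whether some parameters
     have orders-of-magnitude less influence than others. */
  const gsl_vector *g = gsl_multimin_fdfminimizer_gradient (s);
  size_t i;
  printf ("iter %d  f = %.10g ", iter, s->f);
  for (i = 0; i < g->size; i++)
    printf (" g[%zu] = %.3e", i, gsl_vector_get (g, i));
  printf ("\n");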
Cheers
Soren
From: Max Belushkin
Sent: Wed 30-11-2005 06:50
To: [email protected]
Subject: [Help-gsl] On conjugate gradient algorithms in multidimensional
minimisation problems.
Hello list,
I apologize in advance for this rather long email; it is a collected
description of a common problem that I have failed to gain an
understanding of, having tried for many months now.
Over the last couple of years, I have made some observations on
conjugate gradient algorithms that I would like to seek some advice
about. This experience relates to gsl-1.6 and, as of late, gsl-1.7.
-- Part 1 --
The typical problem at hand is a parameter space of 20 dimensions or
more. That is, a chi-squared function is built that depends on 20 or
more independent parameters passed to the minimization routine, and
this function is to be minimized.
The most common problem is the minimizer eventually bailing out with
an error of "iterations not progressing toward a solution" (the exact
message may be a bit different, I do not remember it off-hand). Thus, I
took to restarting the minimizer with
gsl_multimin_fdfminimizer_restart. However, it still does not help. Here
is what happens. If we take a toy sample from the examples and modify
it a little, we get:
do {
tryagain:
  iter++;
  status = gsl_multimin_fdfminimizer_iterate (s);
  if (status) {
    printf ("Restarting minimizer point 1\n");
    gsl_multimin_fdfminimizer_restart (s);
    goto tryagain;
  }
  status = gsl_multimin_test_gradient (s->gradient, 1e-3);
  if (status == GSL_SUCCESS) {
    printf ("converged to minimum\n");
    break;
  }
  if (status != GSL_CONTINUE) {
    gsl_multimin_fdfminimizer_restart (s);
    printf ("Restarted minimizer point 2\n");
    status = GSL_CONTINUE;
  }
} while (status == GSL_CONTINUE && iter < 10000);
What happens is that we will always sit at "Restarting minimizer point
1", and status will always be "iterations not progressing toward a
solution". Removing the goto has the same effect, only I believe
[without any proof, just a feeling] that it should also interfere with
the calls to test_gradient in some way.
Now, if we remove the restart at point 1 and break there, then take
the parameter vector we ended up with, plug it in, and start the program
all over again, it will run for another 5000 iterations or so, then bail
out with the same error at the same point. Repeat. And repeat...
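(For what it's worth, I believe what "start the program all over again from
that point" effectively does, inside one process, is re-initialize the
minimizer at the current point, i.e. something like the sketch below in the
loop; the 0.01 and 1e-4 stand in for whatever step size and tolerance were
passed to gsl_multimin_fdfminimizer_set originally. Unlike
gsl_multimin_fdfminimizer_restart, this also resets the line-search state
and the initial step size.

  /* Mimic "restart the program from the current point" without
     exiting: copy the current best position and re-initialize the
     minimizer there. func is the original gsl_multimin_function_fdf. */
  gsl_vector *xcur = gsl_vector_alloc (s->x->size);
  gsl_vector_memcpy (xcur, gsl_multimin_fdfminimizer_x (s));
  gsl_multimin_fdfminimizer_set (s, &func, xcur, 0.01, 1e-4);
  gsl_vector_free (xcur);

I have not verified that this is exactly equivalent, though.)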
At this point, I was inclined to think it might be an issue with my
parameter space. However, recently, my colleague, who has some code in
Fortran, asked me to come up with a C library employing GSL routines
that would minimize his problem, which is completely different from
mine, but the number of parameters is also large, around 15. So I came
up with a module that could be linked with a Fortran program, and the
exact same phenomenon appears there.
I must admit I am no great expert in the mathematics underlying the
methods (Fletcher-Reeves, Polak-Ribiere and
Broyden-Fletcher-Goldfarb-Shanno give the exact same results, up to
roughly a constant factor in the number of iterations), but I do know
they have to be restarted from time to time. I did things like "if (iter
% 50 == 0) gsl_multimin_fdfminimizer_restart(s);". No effect. So why does
restarting the iteration all over (restarting the program) from that
point work, and sometimes reduce the chi squared quite significantly
over several such runs? And is there any way to force the minimizer to
keep going, or to have more control over the convergence process?
-- Part 2 --
This is somewhat related to Part 1. If I look at the convergence by
the value chi squared takes per iteration, a typical [example] picture
looks like this:
50 -> 30 -> 20.1 -> 20.0 -> 19.99 -> 19.981 -> 19.9809 -> ... <keep basically
idle for some 100s of iterations> -> 19.97 -> 19.5 -> 19.0 -> 18.3 ->
17.8 -> 17.79 -> ... <hang here for another century> -> ...
The iteration process shows huge plateaus. What can this be an
indication of? This is also very common and is seen on a wide range of
problems, so I assume it is a well-known phenomenon. Perhaps someone
has a way to speed up the iteration and avoid sitting on those
plateaus where not much seems to be happening?
I apologize once again for the long email, and would be most grateful
for any help on this subject.
Thanks!
_______________________________________________
Help-gsl mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/help-gsl