Re: [NLopt-discuss] Runtime errors for large gradients

Steven G. Johnson Tue, 08 Mar 2011 08:25:56 -0800


On Mar 8, 2011, at 4:29 AM, [email protected] wrote:

Tracing through the code, this appears to be a problem in theoriginal
SLSQP software. SLSQP works by minimizing an approximate quadratic
subproblem at each step. At the first step of Alexander's 1d testcase,
SLSQP is minimizing:
min_x (x - s)^2
-1 <= x <= 1
I wonder whether this is really the problem actually solved by themethod. I must admit, I have not checked the code, and I am not sureabout SLSQP. The original problem is
min_x  f(x) = -s * x
g(x) = (x-1, -x-1)^T <= 0.
In a general SQP method, one would then solve
min_d  -s * d

You need to realize that SLSQP is a quasi-Newton method, not a Newtonmethod: it is sequential quadratic programming with an *approximate*Hessian (2nd derivative) that is built up iteratively.

That is, at each step, it solves a quadratic approximation of theoriginal objective, where the linear term is the gradient and thequadratic term is an approximate Hessian. The Hessian is constructedfrom BFGS updates, based on the gradients and function values atsubsequent steps, and like any BFGS method it only approaches the trueHessian in the limit of many iterations near a minimum.

In particular, for the very first step the Hessian is initialized tothe identity matrix (divided by 2), hence the (x - s)^2 objective.This is a typical choice for quasi-Newton methods. (Hence the initialsteps should depend on the magnitude of the gradient.)

My actual application is highly non-linear, 50-100 dimensional, andunfortunately too complex to do any rescaling. While LBFGS works, itis not really an option, since it needs too many gradientevaluations. SQP is the only chance to get results within reasonabletimes (less than 12 hours).


SLSQP is a BFGS method.  The main differences between it and LBFGS are:

1) it uses the original BFGS method with dense matrices, rather thanthe LBFGS sparse/low-rank/low-storage factorizations.

2) SLSQP handles nonlinear constrants (its subproblems are generalQPs: quadratic objectives, linear constraints) whereas LBFGS onlyhandles bound constraints.

So, if you have no nonlinear constraints I'm not sure why SLSQP wouldconverge significantly differently than any of the other quasi-Newtonmethods.


Steven

_______________________________________________
NLopt-discuss mailing list
[email protected]
http://ab-initio.mit.edu/cgi-bin/mailman/listinfo/nlopt-discuss

Re: [NLopt-discuss] Runtime errors for large gradients

Reply via email to