Hello,

I have a question regarding the proper scaling of the gradient. Let's see...

I am solving a shape optimization problem in which the independent variables
are physical shape parameters. These are shifted so that they are zero at the
initial design, and non-dimensionalized by a characteristic value of each
parameter multiplied by a global constant.

Thus, what the optimizer actually works with are scaled increments from the
initial design's parameters. I make sure that the objective function is of
order unity, and I use that global constant to tune the magnitude of the
gradient.
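
To fix notation, here is a rough sketch in Python of what I mean (p0, p_char,
c and the function names are only illustrative):

    import numpy as np

    # s = c * p_char : per-parameter scale (characteristic value times the
    # global constant)
    def to_scaled(p, p0, s):
        return (p - p0) / s        # the optimizer sees these increments, x0 = 0

    def to_physical(x, p0, s):
        return p0 + s * x

    def scaled_gradient(dF_dp, s):
        # chain rule: dF/dx_i = dF/dp_i * dp_i/dx_i = s_i * dF/dp_i,
        # so the global constant directly tunes the gradient magnitude
        return s * dF_dp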

Here is an example of the computed gradient of my objective function. Its
2-norm is around 0.35. I find that if the gradient norm is too large (i.e. of
order unity), the first iteration of gradient-based algorithms (which use a
fixed step size) usually produces an infeasible design. Scaling the gradient
down gives an acceptable first iteration, but the problems appear afterwards.

dF/dX =
   0.00416414
  -2.61692e-05
  -0.00100618
  -0.172974
   0.248443
   0.142556
   0.0303844
  -0.0389724
  -0.0697649
  -0.000480529
   0.00117099
   0.00125371
   0.00105413
  -0.0022282
  -0.0503254
   0.00962376
   0.00525732
  -0.0200041
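
To make the step-size point concrete: with the gradient above and a fixed unit
step in the scaled space (the step length here is purely illustrative), the
first steepest-descent-like update already moves some scaled variables by
0.17-0.25, which maps back to a large change of the corresponding physical
parameters:

    import numpy as np

    dFdX = np.array([ 0.00416414, -2.61692e-05, -0.00100618, -0.172974,
                      0.248443,    0.142556,     0.0303844,  -0.0389724,
                     -0.0697649,  -0.000480529,  0.00117099,  0.00125371,
                      0.00105413, -0.0022282,   -0.0503254,   0.00962376,
                      0.00525732, -0.0200041])

    print(np.linalg.norm(dFdX))    # ~0.35, the value in the third column below
    alpha = 1.0                    # fixed first step (illustrative)
    dx = -alpha * dFdX             # first update of the scaled variables
    # back in physical units the change per parameter is s_i * dx_i, i.e. up to
    # ~0.25 * c * p_char_i, which in my case is enough to leave the feasible region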


Below is a table with a sample convergence history before failure, using the
quadratic programming algorithm. The first column is the iteration number, the
second the objective function value, and the third the 2-norm of the gradient.
The latter shows a plateau, I guess due to inner iterations in which the
gradient is not recomputed, but I get similar results with other algorithms.
What happens is that after an initial big step, the optimizer jumps back to the
initial values and just stays there. By monitoring the changes in the
independent parameters (see the sketch after the table) I made sure that this
is really what happens: the objective returning to almost its initial value is
because the parameters themselves go back to their initial values, not because
some other design happens to give a similar objective value.

  iter   objective F          ||dF/dX||
    1    1.2774562543e+00   3.4979438214e-01
    2    1.2869533443e+00   3.8457097663e-01
    3    1.2852265489e+00   3.8457097663e-01
    4    1.2816665890e+00   3.8457097663e-01
    5    1.2798021926e+00   3.8457097663e-01
    6    1.2788357691e+00   3.8457097663e-01
    7    1.2782458402e+00   3.8457097663e-01
    8    1.2779279344e+00   3.8457097663e-01
    9    1.2777893678e+00   3.8457097663e-01
   10    1.2776005145e+00   3.8457097663e-01
   11    1.2775907753e+00   3.8457097663e-01
   12    1.2775704352e+00   3.8457097663e-01
   13    1.2775729016e+00   3.7110935948e-01
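
For concreteness, the monitoring I mention is just a wrapper around the
objective callback that records x and F at every evaluation; a minimal sketch,
assuming the NLopt Python bindings (evaluate_solver stands in for the actual
solver call):

    import numpy as np
    import nlopt

    history = []                            # (x, F) at every evaluation

    def objective(x, grad):
        F, dFdX = evaluate_solver(x)        # stand-in for the real flow solver
        if grad.size > 0:
            grad[:] = dFdX
        history.append((x.copy(), F))       # record the scaled variables
        return F

    n = 18                                  # number of shape parameters
    opt = nlopt.opt(nlopt.LD_SLSQP, n)      # a gradient-based algorithm, SLSQP here
    opt.set_min_objective(objective)
    opt.set_xtol_rel(1e-8)
    x_opt = opt.optimize(np.zeros(n))       # scaled increments start at zero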


One guess would be that such a small gradient is not capable of driving the
optimization. In the example above there are components of order 1e-5, and I
would understand if those were ignored. But others are of order 1e-1; they
should be able to contribute, shouldn't they?

I would say the gradient is computed correctly: I also use a hand-coded
steepest-descent algorithm for debugging, which, while needing an awful lot of
iterations, converges without problems. I have also checked the gradient
against finite differences for some parameters, and it seems to be all right.
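
The finite-difference check is a simple component-wise central difference on
the scaled variables, along the lines of the sketch below (eval_F stands in
for an objective-only evaluation):

    import numpy as np

    def fd_check(eval_F, x, dFdX, indices, eps=1e-6):
        # compare selected components of the analytic gradient against
        # central finite differences in the scaled variables
        for i in indices:
            xp, xm = x.copy(), x.copy()
            xp[i] += eps
            xm[i] -= eps
            fd = (eval_F(xp) - eval_F(xm)) / (2.0 * eps)
            rel = abs(dFdX[i] - fd) / max(abs(fd), 1e-12)
            print(i, dFdX[i], fd, rel)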

Any advice?


Regards, and sorry for the long post.


