Hello, I have a question regarding the proper scaling of the gradient. Let's see...
I am solving a shape optimization problem in which the independent variables are some physical shape parameters. These are zeroed at the initial value and non-dimensionalized by a characteristic value of each parameter, multiplied by a global constant. Thus, what the optimizer actually works with are scaled increments from the initial solution's parameters (a small sketch of this scaling is included at the end of this message). I make sure that the objective function is of order unity, and I use that global constant to tune the magnitude of the gradient.

Here is an example of the computed gradient of my objective function; its square norm is around 0.3. I find that if the gradient norm is too big (say, of order unity), the first iteration of gradient-based algorithms (which use a fixed initial step size) usually gives an unfeasible solution. Scaling down the gradient gives an acceptable first iteration, but the problem comes afterwards.

dF/dX =
   0.00416414  -2.61692e-05  -0.00100618  -0.172974      0.248443      0.142556
   0.0303844   -0.0389724    -0.0697649   -0.000480529   0.00117099    0.00125371
   0.00105413  -0.0022282    -0.0503254    0.00962376    0.00525732   -0.0200041

Below is a sample convergence history before failure, using the Quadratic Programming algorithm. The first column is the iteration number, the second the objective function value, and the third the square norm of the gradient. The latter shows a plateau, I guess due to the inner iterations where the gradient is not recomputed, but I get similar results with other algorithms. What happens is that after an initial big step, the optimizer jumps back to the initial values and just stays there. Monitoring the changes in the independent parameters, I verified that this is indeed what is happening: the near-initial objective value comes from the parameters themselves reverting, not from nonlinearity making a different design give a similar objective.

Iter   Objective             ||dF/dX||^2
  1    1.2774562543e+00      3.4979438214e-01
  2    1.2869533443e+00      3.8457097663e-01
  3    1.2852265489e+00      3.8457097663e-01
  4    1.2816665890e+00      3.8457097663e-01
  5    1.2798021926e+00      3.8457097663e-01
  6    1.2788357691e+00      3.8457097663e-01
  7    1.2782458402e+00      3.8457097663e-01
  8    1.2779279344e+00      3.8457097663e-01
  9    1.2777893678e+00      3.8457097663e-01
 10    1.2776005145e+00      3.8457097663e-01
 11    1.2775907753e+00      3.8457097663e-01
 12    1.2775704352e+00      3.8457097663e-01
 13    1.2775729016e+00      3.7110935948e-01

One guess would be that such a small gradient is not capable of driving the optimization. In the example above there are components of order 1e-5, and I could understand those being ignored, but others are of order 1e-1; they should be able to contribute, shouldn't they?

I believe the gradient is computed correctly: I also use a hand-coded steepest descent algorithm for debugging purposes, which, while needing an awful lot of iterations, converges with no problem. I have also checked against finite differences for some parameters, and it seems to be all right.

Any advice? Regards, and sorry for the long post.
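For concreteness, here is a minimal sketch of the kind of variable scaling described above. The parameter values, the characteristic values c, and the global constant s are made-up placeholders, not the actual values of the problem:

import numpy as np

# Made-up placeholder values, only to illustrate the scaling scheme.
p0 = np.array([0.10, 2.50, 0.03])    # initial shape parameters
c  = np.array([0.01, 0.50, 0.005])   # characteristic value of each parameter
s  = 10.0                            # global scaling constant

def to_scaled(p):
    """Physical parameters -> scaled increments seen by the optimizer."""
    return (p - p0) / (c * s)

def to_physical(x):
    """Scaled increments seen by the optimizer -> physical parameters."""
    return p0 + x * (c * s)

def scaled_gradient(dF_dp):
    """Chain rule: with p = p0 + x*c*s, dF/dx = dF/dp * c * s,
    so the global constant s directly tunes the gradient magnitude."""
    return dF_dp * (c * s)

With this convention the optimizer starts at x = 0, and note that increasing s enlarges both the gradient dF/dx and the physical change produced by a unit step in x.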

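And a minimal sketch of the finite-difference check mentioned above, assuming hypothetical callables objective(x) and gradient(x) that evaluate the scaled objective and its analytic gradient:

import numpy as np

def fd_check(objective, gradient, x, indices, h=1e-6):
    """Compare selected analytic gradient components against central
    finite differences (objective and gradient are assumed callables)."""
    g = gradient(x)
    for i in indices:
        e = np.zeros_like(x)
        e[i] = h
        g_fd = (objective(x + e) - objective(x - e)) / (2.0 * h)
        rel = abs(g[i] - g_fd) / max(abs(g_fd), 1e-12)
        print(f"component {i:2d}: analytic = {g[i]: .6e}, "
              f"finite diff = {g_fd: .6e}, rel. error = {rel:.2e}")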