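Just to add to Emil's answer: since the adjoint ODE is linear, you may not be properly scaling either the initial condition of the adjoint (if your objective is a final-value one) or the adjoint forcing (i.e. the gradient of the objective function with respect to the state, if you have a cost gradient). Because the equation is linear, a wrong factor in either of these scales the whole adjoint solution, and hence the gradient, by that same factor.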
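To illustrate the scaling point, here is a minimal NumPy sketch. It is not the firedrake code from this thread: the toy problem (1D heat equation, implicit Euler, time-constant control), all function names and all parameters are assumptions made up for the illustration. It computes a discrete adjoint gradient twice, once with the correct forcing and once with the time-step weight dropped from the forcing, and compares both against a finite difference gradient. In the test output quoted below, ||fd||/||hc|| is about 0.0149 at all three test points, which is the kind of constant factor a mis-scaled forcing would produce.

```python
import numpy as np

# Hypothetical toy problem (NOT heat_adj.py from this thread):
# 1D heat equation, implicit Euler, time-constant control u, y_0 = 0.
n, nsteps, dt = 8, 100, 0.01
h = 1.0 / (n + 1)
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
M = np.eye(n) + dt * K            # implicit Euler matrix
yd = np.ones(n)                   # desired state
u0 = np.linspace(0.0, 1.0, n)     # point at which the gradient is tested


def solve_forward(u):
    """Return the states y_1..y_N of M y_{k+1} = y_k + dt*u."""
    y = np.zeros_like(u)
    states = []
    for _ in range(nsteps):
        y = np.linalg.solve(M, y + dt * u)
        states.append(y)
    return states


def objective(u):
    """J(u) = 0.5 * dt * sum_k ||y_k - yd||^2."""
    return 0.5 * dt * sum(np.sum((y - yd) ** 2) for y in solve_forward(u))


def adjoint_gradient(u, forcing_scale=1.0):
    """Discrete adjoint gradient; forcing_scale != 1 mimics a scaling bug."""
    lam = np.zeros_like(u)
    grad = np.zeros_like(u)
    for y in reversed(solve_forward(u)):
        # Linear adjoint step, marched backward in time.
        lam = np.linalg.solve(M.T, lam + forcing_scale * dt * (y - yd))
        grad += dt * lam
    return grad


def fd_gradient(u, eps=1e-4):
    """Central finite differences, one control component at a time."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = eps
        g[i] = (objective(u + e) - objective(u - e)) / (2.0 * eps)
    return g


def compare(fd, hc):
    """Angle cosine, relative difference, and best-fit scale fd ~ s*hc."""
    nfd, nhc = np.linalg.norm(fd), np.linalg.norm(hc)
    cosine = fd @ hc / (nfd * nhc)
    rel = np.linalg.norm(fd - hc) / max(nfd, nhc)
    scale = fd @ hc / (hc @ hc)
    return cosine, rel, scale


fd = fd_gradient(u0)
cos_ok, rel_ok, scale_ok = compare(fd, adjoint_gradient(u0))
# Assumed bug: the dt weight is missing from the adjoint forcing.
cos_bad, rel_bad, scale_bad = compare(fd, adjoint_gradient(u0, 1.0 / dt))

print("correct forcing:   ", cos_ok, rel_ok, scale_ok)
print("mis-scaled forcing:", cos_bad, rel_bad, scale_bad)
```

In this sketch the mis-scaled gradient stays parallel to the finite difference one (angle cosine essentially one) while the relative difference is close to one, and the best-fit `scale` recovers the missing factor `dt`. Applying the same `compare` to your own fd and hc vectors may help identify which weight is missing.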

2017-11-22 18:34 GMT+03:00 Smith, Barry F. <[email protected]>:

> > On Nov 22, 2017, at 3:48 AM, Julian Andrej <[email protected]> wrote:
> >
> > Hello,
> >
> > we prepared a small example which computes the gradient via the continuous adjoint method of a heating problem with a cost functional.
>
> Julian,
>
> The first thing to note is that the continuous adjoint is not exactly the same as the adjoint for the actual algebraic system you are solving. (It is only, as I understand it, possibly the same in the limit of a very fine mesh and time step.) Thus you would not actually expect these to match with PETSc fd. Now, as you refine in space/time, do the numbers get closer to each other?
>
> Note the angle cosine is very close to one, which means that they are producing the same search direction, just different lengths.
>
> How is the convergence of the solver if you use -tao_fd_gradient? Do you still need unit?
>
> > but have to use -tao_ls_type unit.
>
> This is slightly odd, because this line search always just takes the full step; the other ones would normally be better since they are more sophisticated in picking the step size. Please run without -tao_ls_type unit and send the output.
>
> Also, does your problem have bound constraints? If not, use -tao_type lmvm and send the output.
>
> Just saw Emil's email; yes, there could easily be a scaling issue with your continuous adjoint.
>
> Barry
>
> >
> > We implemented the textbook example and tested the gradient via a Taylor remainder (which works fine). Now we wanted to solve the
> > optimization problem with TAO and checked the gradient vs. the finite difference gradient and ran into problems.
> >
> > Testing hand-coded gradient (hc) against finite difference gradient (fd), if the ratio ||fd - hc|| / ||hc|| is
> > 0 (1.e-8), the hand-coded gradient is probably correct.
> > Run with -tao_test_display to show difference
> > between hand-coded and finite difference gradient.
> > ||fd|| 0.000147076, ||hc|| = 0.00988136, angle cosine = (fd'hc)/||fd||||hc|| = 0.99768
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.00973464
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985149, difference ||fd-hc|| = 0.00243363
> > ||fd|| 0.000382547, ||hc|| = 0.0257001, angle cosine = (fd'hc)/||fd||||hc|| = 0.997609
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.0253185
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985117, difference ||fd-hc|| = 0.00624562
> > ||fd|| 8.84429e-05, ||hc|| = 0.00594196, angle cosine = (fd'hc)/||fd||||hc|| = 0.997338
> > 2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985156, difference ||fd-hc|| = 0.00585376
> > max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985006, difference ||fd-hc|| = 0.00137836
> >
> > Despite these differences we achieve convergence with our hand coded gradient, but have to use -tao_ls_type unit.
> >
> > $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_gatol 1e-7 -tao_ls_type unit
> > iter = 0, Function value: 0.000316722, Residual: 0.00126285
> > iter = 1, Function value: 3.82272e-05, Residual: 0.000438094
> > iter = 2, Function value: 1.26011e-07, Residual: 8.4194e-08
> > Tao Object: 1 MPI processes
> > type: blmvm
> > Gradient steps: 0
> > TaoLineSearch Object: 1 MPI processes
> > type: unit
> > Active Set subset type: subvec
> > convergence tolerances: gatol=1e-07, steptol=0., gttol=0.
> > Residual in Function/Gradient:=8.4194e-08
> > Objective value=1.26011e-07
> > total number of iterations=2, (max: 2000)
> > total number of function/gradient evaluations=3, (max: 4000)
> > Solution converged: ||g(X)|| <= gatol
> >
> > $ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_fd_gradient
> > iter = 0, Function value: 0.000316722, Residual: 4.87343e-06
> > iter = 1, Function value: 0.000195676, Residual: 3.83011e-06
> > iter = 2, Function value: 1.26394e-07, Residual: 1.60262e-09
> > Tao Object: 1 MPI processes
> > type: blmvm
> > Gradient steps: 0
> > TaoLineSearch Object: 1 MPI processes
> > type: more-thuente
> > Active Set subset type: subvec
> > convergence tolerances: gatol=1e-08, steptol=0., gttol=0.
> > Residual in Function/Gradient:=1.60262e-09
> > Objective value=1.26394e-07
> > total number of iterations=2, (max: 2000)
> > total number of function/gradient evaluations=3474, (max: 4000)
> > Solution converged: ||g(X)|| <= gatol
> >
> >
> > We think, that the finite difference gradient should be in line with our hand coded gradient for such a simple example.
> >
> > We appreciate any hints on debugging this issue. It is implemented in python (firedrake) and i can provide the code if this is needed.
> >
> > Regards
> > Julian

--
Stefano
