Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

Julian Andrej Thu, 23 Nov 2017 02:17:31 -0800

It was indeed a mass scaling issue. We have to project the CADJ-derived gradient to the corresponding FE space again.
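To illustrate the kind of mass scaling meant here, the following is a minimal NumPy sketch, not the heat_adj.py code from this thread. It assumes a 1D P1 discretization and a simple tracking functional J(u) = 1/2 * integral (u - u_d)^2 dx; the names p1_mass_matrix, objective and fd_gradient are hypothetical helpers introduced for this example. The nodal values of the continuous (L2) gradient and the derivative with respect to the coefficients, which is what a finite difference check sees, differ by the mass matrix M.

```python
import numpy as np

def p1_mass_matrix(n_cells, length=1.0):
    """Assemble the P1 mass matrix on a uniform 1D mesh (illustrative only)."""
    h = length / n_cells
    n = n_cells + 1
    M = np.zeros((n, n))
    local = h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
    for e in range(n_cells):
        M[e:e + 2, e:e + 2] += local
    return M

def objective(u, u_d, M):
    """J(u) = 1/2 * integral (u - u_d)^2 dx for P1 coefficient vectors."""
    r = u - u_d
    return 0.5 * r @ M @ r

def fd_gradient(f, u, eps=1e-6):
    """Central finite differences with respect to the coefficients."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = eps
        g[i] = (f(u + e) - f(u - e)) / (2.0 * eps)
    return g

def rel_diff(a, b):
    """2-norm difference relative to the larger of the two norms."""
    return np.linalg.norm(a - b) / max(np.linalg.norm(a), np.linalg.norm(b))

n_cells = 16
M = p1_mass_matrix(n_cells)
x = np.linspace(0.0, 1.0, n_cells + 1)
u = np.sin(np.pi * x)      # assumed current state
u_d = x**2                 # assumed target state

fd = fd_gradient(lambda v: objective(v, u_d, M), u)
g_function = u - u_d       # nodal values of the continuous (L2) gradient
g_coeff = M @ g_function   # derivative with respect to the coefficients

cosine = fd @ g_function / (np.linalg.norm(fd) * np.linalg.norm(g_function))
print("fd vs M @ g:", rel_diff(fd, g_coeff))
print("fd vs g    :", rel_diff(fd, g_function), "angle cosine:", cosine)
```

In this sketch the unscaled gradient points in nearly the same direction as the finite difference gradient but has a very different norm, which is qualitatively the pattern in the failing test output quoted further down. Whether the fix in a given code is a multiplication by M or a solve with M depends on which representation the optimizer expects.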

Testing hand-coded gradient (hc) against finite difference gradient (fd), if the ratio ||fd - hc|| / ||hc|| is
0 (1.e-8), the hand-coded gradient is probably correct.
Run with -tao_test_display to show difference
between hand-coded and finite difference gradient.

||fd|| 0.000150841, ||hc|| = 0.000150841, angle cosine = (fd'hc)/||fd||||hc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 4.48554e-06, difference ||fd-hc|| = 6.76604e-10
max-norm ||fd-hc||/max(||hc||,||fd||) = 4.99792e-06, difference ||fd-hc|| = 1.88044e-10
||fd|| 0.000386312, ||hc|| = 0.000386312, angle cosine = (fd'hc)/||fd||||hc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 1.14682e-05, difference ||fd-hc|| = 4.4303e-09
max-norm ||fd-hc||/max(||hc||,||fd||) = 1.56645e-05, difference ||fd-hc|| = 1.49275e-09
||fd|| 8.46797e-05, ||hc|| = 8.46797e-05, angle cosine = (fd'hc)/||fd||||hc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 2.63488e-06, difference ||fd-hc|| = 2.2312e-10
max-norm ||fd-hc||/max(||hc||,||fd||) = 2.7873e-06, difference ||fd-hc|| = 5.58718e-11
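For readers who want to reproduce the quantities in this kind of test output by hand, here is a small NumPy sketch. It follows the formulas as they are printed in the log and is not taken from the PETSc source; gradient_test_metrics is a hypothetical helper name, and the vectors and the scale factor 67 are made-up illustration data.

```python
import numpy as np

def gradient_test_metrics(fd, hc):
    """Compute the norms, angle cosine and relative differences shown in the log."""
    fd = np.asarray(fd, dtype=float)
    hc = np.asarray(hc, dtype=float)
    diff = fd - hc
    norm_fd = np.linalg.norm(fd)
    norm_hc = np.linalg.norm(hc)
    max_fd = np.linalg.norm(fd, np.inf)
    max_hc = np.linalg.norm(hc, np.inf)
    return {
        "norm_fd": norm_fd,
        "norm_hc": norm_hc,
        # (fd'hc) / (||fd|| ||hc||)
        "angle_cosine": fd @ hc / (norm_fd * norm_hc),
        # ||fd-hc|| / max(||hc||, ||fd||) in the 2-norm
        "rel_2norm": np.linalg.norm(diff) / max(norm_hc, norm_fd),
        # the same ratio in the max-norm
        "rel_maxnorm": np.linalg.norm(diff, np.inf) / max(max_hc, max_fd),
    }

# Hypothetical data: a finite difference gradient that is a pure rescaling
# of the hand-coded one (factor chosen only for illustration).
hc = np.array([3.0, 4.0])
fd_scaled = hc / 67.0
metrics = gradient_test_metrics(fd_scaled, hc)
for name, value in metrics.items():
    print(f"{name:13s} = {value:.6g}")
```

A pure rescaling by a factor c > 1 gives an angle cosine of 1 and a relative difference of 1 - 1/c, so a cosine close to 1 combined with a relative difference close to 1 hints at a scaling problem rather than a wrong direction.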


Thank you all for the quick responses and input again!

On 2017-11-23 09:29, Julian Andrej wrote:

On 2017-11-22 16:27, Emil Constantinescu wrote:
On 11/22/17 3:48 AM, Julian Andrej wrote:
Hello,
we prepared a small example which computes the gradient of a heating problem with a cost functional via the continuous adjoint method.
We implemented the textbook example and tested the gradient via a Taylor remainder test (which works fine). We then wanted to solve the optimization problem with TAO, checked the gradient against the finite difference gradient, and ran into problems.
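As a reminder of what a Taylor remainder test checks, here is a minimal NumPy sketch on a toy objective; it is not the Firedrake code from this thread, and objective, gradient and taylor_remainders are hypothetical names. With a correct gradient, the remainder |J(m + h dm) - J(m) - h <grad J(m), dm>| should shrink like h^2, so halving h should give an observed rate close to 2.

```python
import numpy as np

def objective(m):
    """Toy smooth objective (assumption, stands in for the cost functional)."""
    return np.sum(np.sin(m)) + 0.5 * m @ m

def gradient(m):
    """Exact gradient of the toy objective."""
    return np.cos(m) + m

def taylor_remainders(J, dJ, m, dm, steps):
    """First-order Taylor remainders for a sequence of step sizes."""
    J0 = J(m)
    slope = dJ(m) @ dm  # directional derivative, Euclidean inner product
    return np.array([abs(J(m + h * dm) - J0 - h * slope) for h in steps])

m = np.linspace(0.0, 1.0, 5)          # point at which the gradient is tested
dm = np.ones_like(m)                  # perturbation direction
steps = 1e-2 * 0.5 ** np.arange(5)    # h, h/2, h/4, ...

remainders = taylor_remainders(objective, gradient, m, dm, steps)
rates = np.log2(remainders[:-1] / remainders[1:])
print("remainders:", remainders)
print("observed rates:", rates)
```

Note that the slope term depends on the inner product used to pair the gradient with the perturbation. If the gradient is a function-space (L2) representation, the pairing would involve the mass matrix, so a Taylor test set up that way can pass while a coefficient-wise finite difference comparison still reports a mismatch.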
Testing hand-coded gradient (hc) against finite difference gradient (fd), if the ratio ||fd - hc|| / ||hc|| is
0 (1.e-8), the hand-coded gradient is probably correct.
Run with -tao_test_display to show difference
between hand-coded and finite difference gradient.
||fd|| 0.000147076, ||hc|| = 0.00988136, angle cosine = (fd'hc)/||fd||||hc|| = 0.99768
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.00973464
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985149, difference ||fd-hc|| = 0.00243363
||fd|| 0.000382547, ||hc|| = 0.0257001, angle cosine = (fd'hc)/||fd||||hc|| = 0.997609
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.0253185
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985117, difference ||fd-hc|| = 0.00624562
||fd|| 8.84429e-05, ||hc|| = 0.00594196, angle cosine = (fd'hc)/||fd||||hc|| = 0.997338
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985156, difference ||fd-hc|| = 0.00585376
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985006, difference ||fd-hc|| = 0.00137836
Despite these differences we achieve convergence with our hand-coded gradient, but have to use -tao_ls_type unit.
Both give similar (assume descent) directions, but seem to be scaled
differently. It could be a bad scaling by the mass matrix somewhere in
the continuous adjoint. This could be seen if you plot them side by
side as a quick diagnostic.
I visualized and attached the two gradients. The CADJ is hand coded and
the DADJ is from pyadjoint which is the same as the finite difference
gradient from TAO.
If the attachment gets lost in the mailing list, here is a direct link [1]
[1] https://cloud.tf.uni-kiel.de/index.php/s/nmiNOoI213dx1L1
Emil
$ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_gatol 1e-7 -tao_ls_type unit
iter =   0, Function value: 0.000316722,  Residual: 0.00126285
iter =   1, Function value: 3.82272e-05,  Residual: 0.000438094
iter =   2, Function value: 1.26011e-07,  Residual: 8.4194e-08
Tao Object: 1 MPI processes
   type: blmvm
       Gradient steps: 0
   TaoLineSearch Object: 1 MPI processes
     type: unit
   Active Set subset type: subvec
   convergence tolerances: gatol=1e-07,   steptol=0.,   gttol=0.
   Residual in Function/Gradient:=8.4194e-08
   Objective value=1.26011e-07
   total number of iterations=2,                          (max: 2000)
   total number of function/gradient evaluations=3,      (max: 4000)
   Solution converged:    ||g(X)|| <= gatol
$ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_fd_gradient
iter =   0, Function value: 0.000316722,  Residual: 4.87343e-06
iter =   1, Function value: 0.000195676,  Residual: 3.83011e-06
iter =   2, Function value: 1.26394e-07,  Residual: 1.60262e-09
Tao Object: 1 MPI processes
   type: blmvm
       Gradient steps: 0
   TaoLineSearch Object: 1 MPI processes
     type: more-thuente
   Active Set subset type: subvec
   convergence tolerances: gatol=1e-08,   steptol=0.,   gttol=0.
   Residual in Function/Gradient:=1.60262e-09
   Objective value=1.26394e-07
   total number of iterations=2,                          (max: 2000)
   total number of function/gradient evaluations=3474,      (max: 4000)
   Solution converged:    ||g(X)|| <= gatol
We think that the finite difference gradient should be in line with our hand-coded gradient for such a simple example.
We appreciate any hints on debugging this issue. It is implemented in Python (Firedrake) and I can provide the code if needed.
Regards
Julian
