Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

2017-11-23 Thread Julian Andrej
It was indeed a mass scaling issue. We have to project the CADJ-derived 
gradient back into the corresponding FE space.
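
For readers hitting the same symptom, here is a minimal sketch of the fix (illustrative NumPy only, not the poster's Firedrake code; the uniform 1D P1 mass matrix is an assumption): the continuous adjoint yields the L2 gradient as a load vector b, and the gradient in the FE nodal basis is recovered by solving M g = b with the mass matrix M.

```python
import numpy as np

# Uniform 1D mesh, P1 elements: mass matrix M = h/6 * tridiag(1, 4, 1),
# with the diagonal halved at the two boundary nodes (assumed setup).
n = 101
h = 1.0 / (n - 1)
M = (h / 6.0) * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
M[0, 0] = M[-1, -1] = h / 3.0

b = np.sin(np.pi * np.linspace(0.0, 1.0, n))  # stand-in adjoint load vector
g = np.linalg.solve(M, b)                     # gradient in the FE nodal basis
# In the interior g is roughly b / h, i.e. the two vectors differ by a
# mesh-dependent scale -- the kind of mismatch TAO's FD check reports.
```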


Testing hand-coded gradient (hc) against finite difference gradient (fd),
if the ratio ||fd - hc|| / ||hc|| is O(1.e-8), the hand-coded gradient is probably correct.
Run with -tao_test_display to show difference between hand-coded and finite difference gradient.
||fd|| 0.000150841, ||hc|| = 0.000150841, angle cosine = (fd'hc)/||fdhc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 4.48554e-06, difference ||fd-hc|| = 6.76604e-10
max-norm ||fd-hc||/max(||hc||,||fd||) = 4.99792e-06, difference ||fd-hc|| = 1.88044e-10
||fd|| 0.000386312, ||hc|| = 0.000386312, angle cosine = (fd'hc)/||fdhc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 1.14682e-05, difference ||fd-hc|| = 4.4303e-09
max-norm ||fd-hc||/max(||hc||,||fd||) = 1.56645e-05, difference ||fd-hc|| = 1.49275e-09
||fd|| 8.46797e-05, ||hc|| = 8.46797e-05, angle cosine = (fd'hc)/||fdhc|| = 1.
2-norm ||fd-hc||/max(||hc||,||fd||) = 2.63488e-06, difference ||fd-hc|| = 2.2312e-10
max-norm ||fd-hc||/max(||hc||,||fd||) = 2.7873e-06, difference ||fd-hc|| = 5.58718e-11


Thank you all for the quick responses and input again!

On 2017-11-23 09:29, Julian Andrej wrote:

I visualized and attached the two gradients. The CADJ is hand-coded and
the DADJ is from pyadjoint, which matches the finite difference gradient
from TAO.

If the attachment gets lost in the mailing list, here is a direct link [1]


[1] https://cloud.tf.uni-kiel.de/index.php/s/nmiNOoI213dx1L1



Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

2017-11-22 Thread Zhang, Hong
Hi Julian,

If I remember correctly, you have a code that worked fine with the discrete 
adjoint (TSAdjoint). Was it for the same example? If so, what are the 
differences in the validation output between the continuous adjoint and the 
discrete adjoint?

Hong (Mr.)
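
For context on Hong's point: a discrete adjoint differentiates the actual time-stepping scheme, so it matches a finite-difference check to near machine precision, unlike a continuous adjoint. A self-contained sketch (plain NumPy instead of TSAdjoint, with a made-up linear heat-type system, not the poster's code):

```python
import numpy as np

# Backward Euler on du/dt = A u (a 1D Laplacian stencil); objective
# J(u0) = 0.5 ||u_N||^2. The discrete adjoint just transposes the exact
# step operator, so it agrees with finite differences to rounding error.
n, steps, dt = 20, 30, 1e-2
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
S = np.linalg.inv(np.eye(n) - dt * A)        # one implicit Euler step

def forward(u0):
    u = u0.copy()
    for _ in range(steps):
        u = S @ u
    return u

def J(u0):
    uN = forward(u0)
    return 0.5 * (uN @ uN)

def grad_adjoint(u0):
    lam = forward(u0)                        # terminal condition: dJ/duN = uN
    for _ in range(steps):
        lam = S.T @ lam                      # backward sweep with S^T
    return lam

rng = np.random.default_rng(0)
u0 = rng.standard_normal(n)
g = grad_adjoint(u0)

eps = 1e-6
g_fd = np.array([(J(u0 + eps * e) - J(u0 - eps * e)) / (2 * eps)
                 for e in np.eye(n)])
rel = np.linalg.norm(g - g_fd) / np.linalg.norm(g)   # machine-level match
```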

> On Nov 22, 2017, at 3:48 AM, Julian Andrej  wrote:



Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

2017-11-22 Thread Stefano Zampini
Just to add to Emil's answer: since the adjoint ODE is linear, you may either 
be scaling the initial condition improperly (if your objective is a final-value 
one) or the adjoint forcing (i.e. the gradient of the objective function with 
respect to the state, if you have a cost gradient).
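
A tiny illustration of the scaling Stefano describes (hypothetical numbers, not the poster's problem): for a tracking objective J = 1/2 ∫ (u - d)^2 dx, the adjoint forcing expressed in FE coordinates is M(u - d); dropping the mass matrix rescales the gradient by roughly the mesh size while barely changing its direction.

```python
import numpy as np

# P1 mass matrix on a uniform 1D mesh (assumed textbook setup);
# boundary diagonal entries are halved.
n = 101
h = 1.0 / (n - 1)
M = (h / 6.0) * (4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))
M[0, 0] = M[-1, -1] = h / 3.0

x = np.linspace(0.0, 1.0, n)
u, d = x**2, x                  # stand-in state and target
raw = u - d                     # nodal difference, missing the mass matrix
forcing = M @ raw               # correct adjoint forcing for the L2 misfit

cos = raw @ forcing / (np.linalg.norm(raw) * np.linalg.norm(forcing))
scale = np.linalg.norm(raw) / np.linalg.norm(forcing)
# cos is ~1 (same direction); scale is ~1/h (very different length)
```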

2017-11-22 18:34 GMT+03:00 Smith, Barry F. :


Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

2017-11-22 Thread Smith, Barry F.


> On Nov 22, 2017, at 3:48 AM, Julian Andrej  wrote:
> 
> Hello,
> 
> we prepared a small example which computes the gradient via the continuous 
> adjoint method of a heating problem with a cost functional.

   Julian,

 The first thing to note is that the continuous adjoint is not exactly the 
same as the adjoint of the actual algebraic system you are solving. (As I 
understand it, it is only possibly the same in the limit of a very fine mesh 
and time step.) Thus you would not actually expect these to match with PETSc 
fd. Now, as you refine in space/time, do the numbers get closer to each other?

  Note the angle cosine is very close to one, which means that they are 
producing the same search direction, just with different lengths.
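
Barry's observation can be checked numerically: a gradient that is wrong only by a near-uniform positive scaling keeps an angle cosine close to 1 while the norms disagree, exactly the pattern in the logs. A toy sketch with made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(0)
fd = rng.standard_normal(200)              # stand-in finite-difference gradient
w = 67.0 * (1.0 + 0.05 * rng.random(200))  # nearly-uniform positive scaling
hc = w * fd                                # "hand-coded" gradient, mis-scaled

cos = fd @ hc / (np.linalg.norm(fd) * np.linalg.norm(hc))
ratio = np.linalg.norm(hc) / np.linalg.norm(fd)
# cos stays ~1 (same search direction); ratio ~67 (different length)
```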

   How is the convergence of the solver if you use -tao_fd_gradient? Do you 
still need unit?

> but have to use -tao_ls_type unit.

   This is slightly odd, because this line search always just takes the full 
step; the other ones would normally be better, since they are more 
sophisticated in picking the step size. Please run without -tao_ls_type unit 
and send the output.

   Also, does your problem have bound constraints? If not, use -tao_type lmvm 
and send the output.

   Just saw Emil's email, yes there could easily be a scaling issue with your 
continuous adjoint.

  Barry






Re: [petsc-users] TAO: Finite Difference vs Continuous Adjoint gradient issues

2017-11-22 Thread Emil Constantinescu

On 11/22/17 3:48 AM, Julian Andrej wrote:

Hello,

we prepared a small example which computes the gradient via the 
continuous adjoint method of a heating problem with a cost functional.


We implemented the textbook example and tested the gradient via a 
Taylor remainder (which works fine). Now we wanted to solve the 
optimization problem with TAO, checked the gradient vs. the finite 
difference gradient, and ran into problems.
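
The Taylor-remainder test mentioned here can be sketched in a few lines (a generic illustration with a made-up objective, not the heating problem): if the gradient g is correct, the remainder |J(m + e dm) - J(m) - e g.dm| converges at second order in e.

```python
import numpy as np

def J(m):               # made-up smooth objective
    return np.sin(m).sum() + 0.5 * (m @ m)

def grad(m):            # its exact gradient
    return np.cos(m) + m

rng = np.random.default_rng(3)
m, dm = rng.standard_normal(5), rng.standard_normal(5)

remainders = []
for k in range(4):
    e = 1e-2 / 2**k
    remainders.append(abs(J(m + e * dm) - J(m) - e * (grad(m) @ dm)))
rates = [np.log2(r0 / r1) for r0, r1 in zip(remainders, remainders[1:])]
# rates near 2.0: halving e quarters the remainder, confirming the gradient
```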


Testing hand-coded gradient (hc) against finite difference gradient (fd),
if the ratio ||fd - hc|| / ||hc|| is O(1.e-8), the hand-coded gradient is probably correct.
Run with -tao_test_display to show difference between hand-coded and finite difference gradient.
||fd|| 0.000147076, ||hc|| = 0.00988136, angle cosine = (fd'hc)/||fdhc|| = 0.99768
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.00973464
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985149, difference ||fd-hc|| = 0.00243363
||fd|| 0.000382547, ||hc|| = 0.0257001, angle cosine = (fd'hc)/||fdhc|| = 0.997609
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985151, difference ||fd-hc|| = 0.0253185
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985117, difference ||fd-hc|| = 0.00624562
||fd|| 8.84429e-05, ||hc|| = 0.00594196, angle cosine = (fd'hc)/||fdhc|| = 0.997338
2-norm ||fd-hc||/max(||hc||,||fd||) = 0.985156, difference ||fd-hc|| = 0.00585376
max-norm ||fd-hc||/max(||hc||,||fd||) = 0.985006, difference ||fd-hc|| = 0.00137836


Despite these differences we achieve convergence with our hand-coded 
gradient, but have to use -tao_ls_type unit.


Both give similar (assume descent) directions, but seem to be scaled 
differently. It could be a bad scaling by the mass matrix somewhere in 
the continuous adjoint. This could be seen if you plot them side by side 
as a quick diagnostic.


Emil

$ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_gatol 1e-7 -tao_ls_type unit

iter =   0, Function value: 0.000316722,  Residual: 0.00126285
iter =   1, Function value: 3.82272e-05,  Residual: 0.000438094
iter =   2, Function value: 1.26011e-07,  Residual: 8.4194e-08
Tao Object: 1 MPI processes
   type: blmvm
   Gradient steps: 0
   TaoLineSearch Object: 1 MPI processes
     type: unit
   Active Set subset type: subvec
   convergence tolerances: gatol=1e-07,   steptol=0.,   gttol=0.
   Residual in Function/Gradient:=8.4194e-08
   Objective value=1.26011e-07
   total number of iterations=2,  (max: 2000)
   total number of function/gradient evaluations=3,  (max: 4000)
   Solution converged:    ||g(X)|| <= gatol

$ python heat_adj.py -tao_type blmvm -tao_view -tao_monitor -tao_fd_gradient

iter =   0, Function value: 0.000316722,  Residual: 4.87343e-06
iter =   1, Function value: 0.000195676,  Residual: 3.83011e-06
iter =   2, Function value: 1.26394e-07,  Residual: 1.60262e-09
Tao Object: 1 MPI processes
   type: blmvm
   Gradient steps: 0
   TaoLineSearch Object: 1 MPI processes
     type: more-thuente
   Active Set subset type: subvec
   convergence tolerances: gatol=1e-08,   steptol=0.,   gttol=0.
   Residual in Function/Gradient:=1.60262e-09
   Objective value=1.26394e-07
   total number of iterations=2,  (max: 2000)
   total number of function/gradient evaluations=3474,  (max: 4000)
   Solution converged:    ||g(X)|| <= gatol


We think that the finite difference gradient should be in line with our 
hand-coded gradient for such a simple example.


We appreciate any hints on debugging this issue. It is implemented in 
Python (Firedrake) and I can provide the code if needed.


Regards
Julian