I am still working to understand this. I have a PETSc branch
barry/2022-11-11/fixes-for-tao/release where I have made a few fixes/improvements
to help me run and debug your code.
I made a tiny change to your code, passing the Hessian twice, and ran with
./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 500 -tao_converged_reason -tao_lmvm_recycle -tao_type nls -tao_ls_monitor
and got
18 TAO, Function value: -0.0383888, Residual: 7.46748e-11
TAO solve converged due to CONVERGED_GATOL iterations 18
Is this what you expect? It also works with ntr.
If I run with
./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 10000 -tao_converged_reason -tao_type lmvm -tao_ls_monitor
I get
2753 TAO, Function value: -0.0161685, Residual: 0.120782
0 LS Function value: -0.0161685, Step length: 0.
1 LS Function value: 4.49423e+307, Step length: 1.
stx: 0., fx: -0.0161685, dgx: -0.0145883
sty: 0., fy: -0.0161685, dgy: -0.0145883
2 LS Function value: -0.0161685, Step length: 0.
stx: 0., fx: -0.0161685, dgx: -0.0145883
sty: 1., fy: 4.49423e+307, dgy: 5.68594e+307
2754 TAO, Function value: -0.0161685, Residual: 0.120782
TAO solve did not converge due to DIVERGED_LS_FAILURE iteration 2754
Note the insane fy value that pops up at the end.
The next one
./test_tao_neohooke -tao_monitor -tao_view -tao_max_it 500 -tao_converged_reason -tao_lmvm_recycle -tao_type owlqn -tao_ls_monitor
0 TAO, Function value: 0., Residual: 0.
TAO solve converged due to CONVERGED_GATOL iterations 0
fails right off the bat: somehow the initial residual norm is 0, which should
not depend on the solver (maybe a bug in Tao?).
bmrm gets stuck far from the minimum found by the Newton methods:
1719 TAO, Function value: -2.36706e-06, Residual: 1.94494e-09
I realize this is still far from the problem you reported (which I agree is a
bug). I am working to understand enough to provide a proper fix for the bug
instead of just doing something ad hoc.
Barry
> On Nov 4, 2022, at 7:43 AM, Stephan Köhler
> <[email protected]> wrote:
>
> Barry,
>
> this is non-artificial code. The problem is in the ALMM subsolver. I
> want to solve a problem with a TaoALMM solver; what then happens is:
>
> TaoSolve(tao)    /* TaoALMM solver */
>    |
>    |
>    |--------> This calls the TaoALMM subsolver routine
>
>               TaoSolve(subsolver)
>                  |
>                  |
>                  |-----------> The subsolver does not work correctly,
>                                at least with an Armijo line search, since
>                                the solution is overwritten within the
>                                line search. In my case, the subsolver
>                                does not make any progress although
>                                progress is possible.
>
> To get to my real problem, you can simply change line 268 from if(1) to
> if(0), and uncomment line 317, i.e., change
> // ierr = TaoSolve(tao); CHKERRQ(ierr); to ierr = TaoSolve(tao); CHKERRQ(ierr);
> What you can see is that the solver does not make any progress, but it should
> make progress.
>
> To be honest, I do not really know why the option
> -tao_almm_subsolver_tao_ls_monitor has no effect if the ALMM solver is
> called and not the subsolver. I also do not know why
> -tao_almm_subsolver_tao_view prints as the termination reason for the subsolver
>
> Solution converged: ||g(X)|| <= gatol
>
> This is obviously not the case. I set the tolerances
> -tao_almm_subsolver_tao_gatol 1e-8 \
> -tao_almm_subsolver_tao_grtol 1e-8 \
>
> I encountered this first, then looked into the ALMM class, and therefore
> tried to call the subsolver directly (previous example).
>
> I attach the updated program and also the options.
>
> Stephan
>
> On 03.11.22 22:15, Barry Smith wrote:
>>
>> Thanks for your response and the code. I understand the potential problem
>> and how your code demonstrates a bug if the TaoALMMSubsolverObjective() is
>> used in the manner you use in the example where you directly call
>> TaoComputeObjective() multiple times like a line search code might.
>>
>> What I don't have or understand is how to reproduce the problem in a real
>> code that uses Tao, that is, a case where the Tao Armijo line search code has
>> a problem when it is used (somehow) in a Tao solver with ALMM. You suggest "If
>> you have an example of your own, you can switch to the Armijo line search with
>> the option -tao_ls_type armijo. The thing is that it will cause no problems
>> if the line search accepts the steps with step length one." I don't see how
>> to do this: if I use -tao_type almm, I cannot use -tao_ls_type armijo; that is,
>> the option -tao_ls_type doesn't seem to me to be usable in the context of
>> almm (since almm internally applies its own trust region approach for
>> globalization). If we remove the if (1) code from your example, are there
>> some Tao options I can use to get the bug to appear inside the Tao solve?
>>
>> I'll try to explain again: I agree that the fact that the Tao solution is
>> aliased (within the ALMM solver) is a problem with repeated calls to
>> TaoComputeObjective(), but I cannot see how these repeated calls could ever
>> happen in the use of TaoSolve() with the ALMM solver. That is, when is this
>> "design problem" a true problem as opposed to just a potential problem that
>> can be demonstrated in artificial code?
>>
>> The reason I need to understand the non-artificial situation in which it
>> breaks things is to come up with an appropriate correction for the current code.
>>
>> Barry
>>
>>> On Nov 3, 2022, at 12:46 PM, Stephan Köhler
>>> <[email protected]> wrote:
>>>
>>> Barry,
>>>
>>> so far, I have not experimented with trust-region methods, but I can
>>> imagine that this "design feature" causes no problem for trust-region
>>> methods, if the old point is saved and, after the trust-region check fails,
>>> the old point is copied back to the current point. But the implementation of
>>> the Armijo line search method does not work that way. Here, the current point
>>> will always be overwritten. Only if the line search fails is the old
>>> point restored, but then the TaoSolve method ends with a line search
>>> failure.
>>>
>>> If you have an example of your own, you can switch to the Armijo line search
>>> with the option -tao_ls_type armijo. The thing is that it will cause no
>>> problems if the line search accepts the steps with step length one.
>>> It is also possible that, by luck, it will cause no problems if the
>>> "excessive" step brings a reduction of the objective.
>>>
>>> Otherwise, I attach my example, which is not minimal, but here you can see
>>> that it causes problems. You need to set the paths to the PETSc library in
>>> the makefile. You will find the options for this problem in the
>>> run_test_tao_neohooke.sh script.
>>> The important part begins at line 292 in test_tao_neohooke.cpp
>>>
>>> Stephan
>>>
>>> On 02.11.22 19:04, Barry Smith wrote:
>>>> Stephan,
>>>>
>>>> I have located the troublesome line in TaoSetUp_ALMM(); it has the line
>>>>
>>>> auglag->Px = tao->solution;
>>>>
>>>> and in almm.h it has
>>>>
>>>> Vec Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */
>>>>
>>>> Now auglag->Px in some situations aliases auglag->P, and in some cases
>>>> auglag->Px serves to hold a portion of auglag->P. So then in
>>>> TaoALMMSubsolverObjective_Private()
>>>> the lines
>>>>
>>>> PetscCall(VecCopy(P, auglag->P));
>>>> PetscCall((*auglag->sub_obj)(auglag->parent));
>>>>
>>>> cause, just as you said, tao->solution to be overwritten by the P at
>>>> which the objective function is being computed. In other words, the
>>>> solution of the outer Tao is aliased with the solution of the inner Tao,
>>>> by design.
>>>>
>>>> You are definitely correct, the use of TaoALMMSubsolverObjective_Private
>>>> and TaoALMMSubsolverObjectiveAndGradient_Private in a line search would
>>>> be problematic.
>>>>
>>>> I am not an expert at these methods or their implementations. Could you
>>>> point to an actual use case within Tao that triggers the problem? Is there
>>>> a set of command line options or code calls to Tao that fail due to this
>>>> "design feature"? Within the standard use of ALMM, I do not see how the
>>>> objective function would be used within a line search. The TaoSolve_ALMM()
>>>> code is self-correcting in that if a trust region check fails it
>>>> automatically rolls back the solution.
>>>>
>>>> Barry
>>>>
>>>>> On Oct 28, 2022, at 4:27 AM, Stephan Köhler
>>>>> <[email protected]> wrote:
>>>>>
>>>>> Dear PETSc/Tao team,
>>>>>
>>>>> it seems that there is a bug in the TaoALMM class:
>>>>>
>>>>> In the methods TaoALMMSubsolverObjective_Private and
>>>>> TaoALMMSubsolverObjectiveAndGradient_Private, the vector at which the
>>>>> function value of the augmented Lagrangian is evaluated
>>>>> is copied into the current solution; see, e.g.,
>>>>> https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html line
>>>>> 672 or 682. This causes the subsolver routine not to converge if the line
>>>>> search for the subsolver rejects the step length 1. for some
>>>>> update. In detail:
>>>>>
>>>>> Suppose the current iterate is xk and the current update is dxk. The line
>>>>> search now evaluates the augmented Lagrangian at (xk + dxk). As a result,
>>>>> the value (xk + dxk) is copied into the current solution. If the
>>>>> point (xk + dxk) is rejected, the line search should
>>>>> try the point (xk + alpha * dxk), where alpha < 1. But due to the
>>>>> copying, what happens is that the point ((xk + dxk) + alpha * dxk) is
>>>>> evaluated; see, e.g.,
>>>>> https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html
>>>>> line 191.
>>>>>
>>>>> Best regards
>>>>> Stephan Köhler
>>>>>
>>>>> <OpenPGP_0xC9BF2C20DFE9F713.asc>
>>>
>>> <Minimal_example_without_vtk_2.tar.gz><OpenPGP_0xC9BF2C20DFE9F713.asc>
>>
>
> --
> Stephan Köhler
> TU Bergakademie Freiberg
> Institut für numerische Mathematik und Optimierung
>
> Akademiestraße 6
> 09599 Freiberg
> Gebäudeteil Mittelbau, Zimmer 2.07
>
> Telefon: +49 (0)3731 39-3173 (Büro)
> <run_test_tao_neohooke.sh><test_tao_neohooke.cpp><OpenPGP_0xC9BF2C20DFE9F713.asc>
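
For reference, the failure mode described in the original report (the line
search evaluating ((xk + dxk) + alpha * dxk) instead of (xk + alpha * dxk))
can be reproduced in a few lines of plain C without PETSc. This is only a toy
sketch under loud assumptions: ToyProblem, toy_objective and toy_armijo are
made-up names, the objective is the scalar f(x) = x^2, and the backtracking
loop is a generic Armijo rule, not the code in armijo.c or almm.c. The
use_saved_base flag mimics the "save the old point" idea mentioned in the
thread; it is an illustration, not the proposed PETSc fix.

```c
#include <assert.h>

/* Hypothetical stand-in for the solver state discussed in the thread. */
typedef struct {
  double solution; /* plays the role of the shared (aliased) solution vector */
  int    alias;    /* nonzero: evaluating the objective overwrites solution */
} ToyProblem;

/* f(x) = x^2. With alias set, the evaluation point is copied into the
   shared solution as a side effect, modeled on the reported behavior. */
static double toy_objective(ToyProblem *p, double x)
{
  if (p->alias) p->solution = x;
  return x * x;
}

/* Generic backtracking Armijo search along dx.
   use_saved_base == 0: each trial point is built from p->solution.
   use_saved_base != 0: each trial point is built from a private copy of xk.
   Returns the accepted step length, or 0.0 if every trial was rejected
   (in which case the starting point is restored). */
static double toy_armijo(ToyProblem *p, double dx, int use_saved_base, int max_trials)
{
  const double sigma = 1e-4, beta = 0.5;
  const double xk    = p->solution;    /* private copy of the start point */
  const double f0    = xk * xk;        /* f(xk), computed without side effects */
  const double slope = 2.0 * xk * dx;  /* directional derivative f'(xk) * dx */
  double       alpha = 1.0;
  int          i;

  assert(max_trials > 0);
  assert(slope < 0.0); /* dx must be a descent direction */
  for (i = 0; i < max_trials; i++) {
    const double base  = use_saved_base ? xk : p->solution;
    const double trial = base + alpha * dx;
    const double f     = toy_objective(p, trial);
    if (f <= f0 + sigma * alpha * slope) { /* sufficient decrease: accept */
      p->solution = trial;
      return alpha;
    }
    alpha *= beta; /* reject and shrink the step */
  }
  p->solution = xk; /* line search failed: restore the old point */
  return 0.0;
}
```

Starting from xk = 1 with dx = -3, the full step lands at -2 and is rejected.
Without aliasing, the next trial is 1 + 0.5 * (-3) = -0.5 and is accepted.
With aliasing and the shared base, the next trial is -2 + 0.5 * (-3) = -3.5,
every later trial moves further away, and the search ends in failure with the
old point restored, which matches the "no progress" symptom. With aliasing but
a saved base point, the step 0.5 is accepted again.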