Re: [petsc-users] Convergence Error Debugging with KSP solvers in SNES

Karl Yang Thu, 08 Oct 2020 22:54:14 -0700

Hi, Barry,

Thanks for your reply. Yes, I should have used fgmres. But after switching to 
fgmres I'm still facing the same convergence issue.
Seems like the reason is due to DIVERGED_PC_FAILED. But I simply used FD 
jacobian, and fieldsplitPC. I am a bit lost on whether I made some mistakes 
somewhere in the FormFunction or I did not setup the solver correctly.
///////code///////
SNESSetFunction(snes, r, FormFunctionStatic, this);
// SNESSetJacobian(snes, J, J, FormJacobianStatic, this);
SNESSetJacobian(snes, J, J, SNESComputeJacobianDefault, this);
SNESMonitorSet(snes, MySNESMonitor, NULL, NULL);


SNESGetKSP(snes, &ksp);
KSPGetPC(ksp, &pc);
PCSetType(pc, PCFIELDSPLIT);
PCFieldSplitSetDetectSaddlePoint(pc, PETSC_TRUE);
PCFieldSplitSetSchurPre(pc, PC_FIELDSPLIT_SCHUR_PRE_SELF, NULL);
KSPMonitorSet(ksp, MyKSPMonitor, NULL, 0);
KSPSetTolerances(ksp, 1e-8, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT);
SNESSetFromOptions(snes);
//////end/////////

Output from SNES/KSP solver
################# step 1 #################
iter = 0, SNES Function norm 0.0430713
iteration 0 KSP Residual norm 4.307133784528e-02
0 KSP unpreconditioned resid norm 4.307133784528e-02 true resid norm 
4.307133784528e-02 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 4.451434065870e-07
1 KSP unpreconditioned resid norm 4.451434065870e-07 true resid norm 
4.451434065902e-07 ||r(i)||/||b|| 1.033502623460e-05
iteration 2 KSP Residual norm 1.079756105012e-12
2 KSP unpreconditioned resid norm 1.079756105012e-12 true resid norm 
1.079754870815e-12 ||r(i)||/||b|| 2.506898844643e-11
Linear solve converged due to CONVERGED_RTOL iterations 2
iter = 1, SNES Function norm 2.40846e-05
iteration 0 KSP Residual norm 2.408462930023e-05
0 KSP unpreconditioned resid norm 2.408462930023e-05 true resid norm 
2.408462930023e-05 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 1.096958085415e-11
1 KSP unpreconditioned resid norm 1.096958085415e-11 true resid norm 
1.096958085425e-11 ||r(i)||/||b|| 4.554598170270e-07
iteration 2 KSP Residual norm 5.909523288165e-16
2 KSP unpreconditioned resid norm 5.909523288165e-16 true resid norm 
5.909519599233e-16 ||r(i)||/||b|| 2.453647729249e-11
Linear solve converged due to CONVERGED_RTOL iterations 2
iter = 2, SNES Function norm 1.19684e-14
################# step 2 #################
iter = 0, SNES Function norm 0.00391662
iteration 0 KSP Residual norm 3.916615614134e-03
0 KSP unpreconditioned resid norm 3.916615614134e-03 true resid norm 
3.916615614134e-03 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 4.068800385009e-08
1 KSP unpreconditioned resid norm 4.068800385009e-08 true resid norm 
4.068800384986e-08 ||r(i)||/||b|| 1.038856192653e-05
iteration 2 KSP Residual norm 8.427513055511e-14
2 KSP unpreconditioned resid norm 8.427513055511e-14 true resid norm 
8.427497502034e-14 ||r(i)||/||b|| 2.151729537007e-11
Linear solve converged due to CONVERGED_RTOL iterations 2
iter = 1, SNES Function norm 1.99152e-07
iteration 0 KSP Residual norm 1.991523558528e-07
0 KSP unpreconditioned resid norm 1.991523558528e-07 true resid norm 
1.991523558528e-07 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 1.413505562549e-13
1 KSP unpreconditioned resid norm 1.413505562549e-13 true resid norm 
1.413505562550e-13 ||r(i)||/||b|| 7.097609046588e-07
iteration 2 KSP Residual norm 5.165934822520e-18
2 KSP unpreconditioned resid norm 5.165934822520e-18 true resid norm 
5.165932973227e-18 ||r(i)||/||b|| 2.593960262787e-11
Linear solve converged due to CONVERGED_RTOL iterations 2
iter = 2, SNES Function norm 1.69561e-16
################# step 3 #################
iter = 0, SNES Function norm 0.00035615
iteration 0 KSP Residual norm 3.561504844171e-04
0 KSP unpreconditioned resid norm 3.561504844171e-04 true resid norm 
3.561504844171e-04 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 3.701591890269e-09
1 KSP unpreconditioned resid norm 3.701591890269e-09 true resid norm 
3.701591890274e-09 ||r(i)||/||b|| 1.039333667153e-05
iteration 2 KSP Residual norm 7.832821034843e-15
2 KSP unpreconditioned resid norm 7.832821034843e-15 true resid norm 
7.832856926692e-15 ||r(i)||/||b|| 2.199311041093e-11
Linear solve converged due to CONVERGED_RTOL iterations 2
iter = 1, SNES Function norm 1.64671e-09
iteration 0 KSP Residual norm 1.646709543241e-09
0 KSP unpreconditioned resid norm 1.646709543241e-09 true resid norm 
1.646709543241e-09 ||r(i)||/||b|| 1.000000000000e+00
iteration 1 KSP Residual norm 1.043230469512e-15
1 KSP unpreconditioned resid norm 1.043230469512e-15 true resid norm 
1.043230469512e-15 ||r(i)||/||b|| 6.335242749968e-07
iteration 1 KSP Residual norm 0.000000000000e+00
1 KSP unpreconditioned resid norm 0.000000000000e+00 true resid norm -nan 
||r(i)||/||b|| -nan
Linear solve did not converge due to DIVERGED_PC_FAILED iterations 1
PC_FAILED due to SUBPC_ERROR

More information from -ksp_error_if_not_converged -info
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 
3.303168180659e-07 is less than relative tolerance 1.000000000000e-05 times 
initial right hand side norm 7.795816360977e-02 at iteration 12
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] KSPConvergedDefault(): Linear solver has converged. Residual norm 
2.227610512466e+00 is less than relative tolerance 1.000000000000e-05 times 
initial right hand side norm 5.453050347652e+05 at iteration 12
[0] PCSetUp(): Leaving PC with identical preconditioner since operator is 
unchanged
[0] KSPConvergedDefault(): Linear solver is diverging. Initial right hand size 
norm 9.501675075823e-01, current residual norm 4.894880836662e+04 at iteration 
210
[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[0]PETSC ERROR:
[0]PETSC ERROR: KSPSolve has not converged, reason DIVERGED_DTOL
[0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for 
trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.12.5, Mar, 29, 2020
[0]PETSC ERROR: ./stokeTutorial on a arch-linux2-c-debug named a2aa8f1c96aa by 
Unknown Fri Oct 9 13:43:28 2020
[0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ 
--with-fc=gfortran --download-mpich --download-fblaslapack --with-cuda
[0]PETSC ERROR: #1 KSPSolve() line 832 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #2 PCApply_FieldSplit_Schur() line 1189 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/pc/impls/fieldsplit/fieldsplit.c
[0]PETSC ERROR: #3 PCApply() line 444 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/pc/interface/precon.c
[0]PETSC ERROR: #4 KSP_PCApply() line 281 in 
/usr/local/petsc/petsc-3.12.5/include/petsc/private/kspimpl.h
[0]PETSC ERROR: #5 KSPFGMRESCycle() line 166 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
[0]PETSC ERROR: #6 KSPSolve_FGMRES() line 291 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/ksp/impls/gmres/fgmres/fgmres.c
[0]PETSC ERROR: #7 KSPSolve() line 760 in 
/usr/local/petsc/petsc-3.12.5/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: #8 SNESSolve_NEWTONLS() line 225 in 
/usr/local/petsc/petsc-3.12.5/src/snes/impls/ls/ls.c
[0]PETSC ERROR: #9 SNESSolve() line 4482 in 
/usr/local/petsc/petsc-3.12.5/src/snes/interface/snes.c
[0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 
-2080374783
SNES Object: 1 MPI processes
type: newtonls
maximum iterations=50, maximum function evaluations=10000
tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
total number of linear solver iterations=2
total number of function evaluations=1322
norm schedule ALWAYS
Jacobian is built using finite differences one column at a time
SNESLineSearch Object: 1 MPI processes
type: bt
interpolation: cubic
alpha=1.000000e-04
maxstep=1.000000e+08, minlambda=1.000000e-12
tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
maximum iterations=40
KSP Object: 1 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no 
iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: fieldsplit
FieldSplit with Schur preconditioner, blocksize = 1, factorization FULL
Preconditioner for the Schur complement formed from S itself
Split info:
Split number 0 Defined by IS
Split number 1 Defined by IS
KSP solver for A00 block
KSP Object: (fieldsplit_0_) 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no 
iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fieldsplit_0_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=512, cols=512
package used to perform factorization: petsc
total: nonzeros=9213, allocated nonzeros=9213
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=512, cols=512
total: nonzeros=9213, allocated nonzeros=9213
total number of mallocs used during MatSetValues calls=0
not using I-node routines
KSP solver for S = A11 - A10 inv(A00) A01
KSP Object: (fieldsplit_1_) 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no 
iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fieldsplit_1_) 1 MPI processes
type: none
linear system matrix = precond matrix:
Mat Object: (fieldsplit_1_) 1 MPI processes
type: schurcomplement
rows=147, cols=147
Schur complement A11 - A10 inv(A00) A01
A11
Mat Object: 1 MPI processes
type: seqaij
rows=147, cols=147
total: nonzeros=147, allocated nonzeros=147
total number of mallocs used during MatSetValues calls=0
not using I-node routines
A10
Mat Object: 1 MPI processes
type: seqaij
rows=147, cols=512
total: nonzeros=2560, allocated nonzeros=2560
total number of mallocs used during MatSetValues calls=0
using I-node routines: found 87 nodes, limit used is 5
KSP of A00
KSP Object: (fieldsplit_0_) 1 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no 
iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fieldsplit_0_) 1 MPI processes
type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=512, cols=512
package used to perform factorization: petsc
total: nonzeros=9213, allocated nonzeros=9213
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=512, cols=512
total: nonzeros=9213, allocated nonzeros=9213
total number of mallocs used during MatSetValues calls=0
not using I-node routines
A01
Mat Object: 1 MPI processes
type: seqaij
rows=512, cols=147
total: nonzeros=2562, allocated nonzeros=2562
total number of mallocs used during MatSetValues calls=0
not using I-node routines
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=659, cols=659
total: nonzeros=14482, allocated nonzeros=27543
total number of mallocs used during MatSetValues calls=1309
not using I-node routines

On Oct 9 2020, at 2:17 am, Barry Smith <[email protected]> wrote:
>
> When you get a huge change at restart this means something is seriously wrong 
> with either the linear operator or the linear preconditioner.
>
> How are you doing the matrix vector product? Note both the operator and 
> preconditioner must be linear operators for GMRES.
>
> FGMRES allows the preconditioner to be nonlinear. You can try
>
> -ksp_type fgmres -ksp_monitor_true_residual
>
> Barry
>
>
> > On Oct 8, 2020, at 2:43 AM, Yang Juntao <[email protected] 
> > (https://link.getmailspring.com/link/[email protected]/0?redirect=mailto%3AY.Juntao%40hotmail.com&recipient=cGV0c2MtdXNlcnNAbWNzLmFubC5nb3Y%3D)>
> >  wrote:
> > Hello,
> >
> > I’m working on a nonlinear solver with SNES with handcoded jacobian and 
> > function. Each linear solver is solved with KSP solver.
> > But sometimes I got issues with ksp solver convergence. I tried with finite 
> > difference approximated jacobian, but get the same error.
> >
> > From the iterations, the convergence seems ok at the beginning but suddenly 
> > diverged in the last iteration.
> > Hope anyone with experience on ksp solvers could direct me to a direction I 
> > can debug the problem.
> >
> > iter = 0, SNES Function norm 2.94934e-06
> > iteration 0 KSP Residual norm 1.094600281831e-06
> > iteration 1 KSP Residual norm 1.264284474186e-08
> > iteration 2 KSP Residual norm 6.593269221816e-09
> > iteration 3 KSP Residual norm 1.689570779457e-09
> > iteration 4 KSP Residual norm 1.040661505932e-09
> > iteration 5 KSP Residual norm 5.422761817348e-10
> > iteration 6 KSP Residual norm 2.492867371369e-10
> > iteration 7 KSP Residual norm 8.261522376775e-11
> > iteration 8 KSP Residual norm 4.246401544245e-11
> > iteration 9 KSP Residual norm 2.514366787388e-11
> > iteration 10 KSP Residual norm 1.982940267051e-11
> > iteration 11 KSP Residual norm 1.586470414676e-11
> > iteration 12 KSP Residual norm 9.866392216207e-12
> > iteration 13 KSP Residual norm 4.951342176999e-12
> > iteration 14 KSP Residual norm 2.418292660318e-12
> > iteration 15 KSP Residual norm 1.747418526086e-12
> > iteration 16 KSP Residual norm 1.094150535809e-12
> > iteration 17 KSP Residual norm 4.464287492066e-13
> > iteration 18 KSP Residual norm 3.530090494462e-13
> > iteration 19 KSP Residual norm 2.825698091454e-13
> > iteration 20 KSP Residual norm 1.950568425807e-13
> > iteration 21 KSP Residual norm 1.227898091813e-13
> > iteration 22 KSP Residual norm 5.411106347374e-14
> > iteration 23 KSP Residual norm 4.511115848564e-14
> > iteration 24 KSP Residual norm 4.063546606691e-14
> > iteration 25 KSP Residual norm 3.677694771949e-14
> > iteration 26 KSP Residual norm 3.459244943466e-14
> > iteration 27 KSP Residual norm 3.263954971093e-14
> > iteration 28 KSP Residual norm 3.087344619079e-14
> > iteration 29 KSP Residual norm 2.809426925625e-14
> > iteration 30 KSP Residual norm 4.366149884754e-01
> > Linear solve did not converge due to DIVERGED_DTOL iterations 30
> >
> >
> >
> > SNES Object: 1 MPI processes
> > type: newtonls
> > SNES has not been set up so information may be incomplete
> > maximum iterations=50, maximum function evaluations=10000
> > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
> > total number of linear solver iterations=0
> > total number of function evaluations=0
> > norm schedule ALWAYS
> > SNESLineSearch Object: 1 MPI processes
> > type: bt
> > interpolation: cubic
> > alpha=1.000000e-04
> > maxstep=1.000000e+08, minlambda=1.000000e-12
> > tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
> > lambda=1.000000e-08
> > maximum iterations=40
> > KSP Object: 1 MPI processes
> > type: gmres
> > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization 
> > with no iterative refinement
> > happy breakdown tolerance 1e-30
> > maximum iterations=10000, initial guess is zero
> > tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
> > left preconditioning
> > using DEFAULT norm type for convergence test
> > PC Object: 1 MPI processes
> > type: fieldsplit
> > PC has not been set up so information may be incomplete
> > FieldSplit with Schur preconditioner, factorization FULL
> > Preconditioner for the Schur complement formed from S itself
> > Split info:
> > KSP solver for A00 block
> > not yet available
> > KSP solver for S = A11 - A10 inv(A00) A01
> > not yet available
> > linear system matrix = precond matrix:
> > Mat Object: 1 MPI processes
> > type: seqaij
> > rows=659, cols=659
> > total: nonzeros=659, allocated nonzeros=7908
> > total number of mallocs used during MatSetValues calls=0
> > not using I-node routines
> >
> > Regards
> > Juntao
> >
> >
>
>
>

Re: [petsc-users] Convergence Error Debugging with KSP solvers in SNES

Reply via email to