[petsc-users] interpreting data from SNESSolve profiling

Matteo Semplice Wed, 08 Feb 2023 04:57:07 -0800

Dear all,

I am trying to optimize the nonlinear solvers in a code of mine, but I am having a hard time interpreting the profiling data from the SNES. In particular, if I run with -snesCorr_snes_lag_jacobian 5 -snesCorr_snes_linesearch_monitor -snesCorr_snes_monitor -snesCorr_snes_linesearch_type basic -snesCorr_snes_view I get, for all timesteps, an output like


 0 SNES Function norm 2.204257292307e+00
 1 SNES Function norm 5.156376709750e-03
 2 SNES Function norm 9.399026338316e-05
 3 SNES Function norm 1.700505246874e-06
 4 SNES Function norm 2.938127043559e-08
SNES Object: snesCorr (snesCorr_) 1 MPI process
 type: newtonls
 maximum iterations=50, maximum function evaluations=10000
 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
 total number of linear solver iterations=4
 total number of function evaluations=5
 norm schedule ALWAYS
 Jacobian is rebuilt every 5 SNES iterations
 SNESLineSearch Object: (snesCorr_) 1 MPI process
   type: basic
   maxstep=1.000000e+08, minlambda=1.000000e-12
   tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
   maximum iterations=40
 KSP Object: (snesCorr_) 1 MPI process
   type: gmres
     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
     happy breakdown tolerance 1e-30
   maximum iterations=10000, initial guess is zero
   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
   left preconditioning
   using PRECONDITIONED norm type for convergence test
 PC Object: (snesCorr_) 1 MPI process
   type: ilu
     out-of-place factorization
     0 levels of fill
     tolerance for zero pivot 2.22045e-14
     matrix ordering: natural
     factor fill ratio given 1., needed 1.
       Factored matrix follows:
         Mat Object: (snesCorr_) 1 MPI process
           type: seqaij
           rows=1200, cols=1200
           package used to perform factorization: petsc
           total: nonzeros=17946, allocated nonzeros=17946
             using I-node routines: found 400 nodes, limit used is 5
   linear system matrix = precond matrix:
   Mat Object: 1 MPI process
     type: seqaij
     rows=1200, cols=1200
     total: nonzeros=17946, allocated nonzeros=17946
     total number of mallocs used during MatSetValues calls=0
       using I-node routines: found 400 nodes, limit used is 5

I guess that this means that no linesearch is performed and the full Newton step is always taken (I did not report the full output, but all timesteps are alike). Also, with the default (bt) LineSearch, the total CPU time does not change, which seems in line with this.
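To make the counts in the log above concrete (4 iterations, 5 function evaluations), here is a minimal sketch of a Newton iteration that always takes the full step and lags the Jacobian. It is plain Python on a scalar toy problem, not PETSc code: the name newton_full_step, the toy residual, and the evaluation counter are all assumptions introduced for illustration only.

```python
# Toy illustration only: plain Python, not PETSc.
def newton_full_step(residual, jacobian, x0, rtol=1e-8, max_it=50, lag=5):
    """Scalar Newton iteration that always takes the full step.

    The Jacobian is recomputed only every `lag` iterations.
    Returns (x, iterations, number_of_residual_evaluations).
    """
    n_eval = 0
    x = x0
    f = residual(x)          # initial residual (iteration 0)
    n_eval += 1
    f0 = abs(f)
    J = None
    for it in range(max_it):
        if abs(f) <= rtol * f0:   # relative convergence test
            return x, it, n_eval
        if it % lag == 0:
            J = jacobian(x)       # lagged Jacobian rebuild
        x = x - f / J             # full Newton step, no step-length search
        f = residual(x)           # residual at the new iterate
        n_eval += 1
    return x, max_it, n_eval


# Hypothetical toy problem: solve x^2 - 2 = 0 starting from x = 1.5.
x, its, n_eval = newton_full_step(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.5)
print(its, n_eval)  # n_eval is always its + 1
```

In this sketch the residual at the new iterate is needed for the next convergence check whether or not any step-length search happens, which is consistent with one evaluation per iteration plus the initial one. Whether PETSc accounts for that evaluation inside the line-search routine is an assumption here, not something the log itself shows.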

However, I would have expected the time spent in SNESLineSearch to be negligible, but the flamegraph shows that about 38% of the time spent by SNESSolve is actually spent in SNESLineSearch. Furthermore, SNESLineSearch seems to cause more SNESFunction evaluations (in terms of CPU time) than the SNESSolve itself. The flamegraph is attached.

Could some expert help me in understanding these data? Is the LineSearch actually performing the Newton step? Given that the full step is always taken, can the SNESFunction evaluations from the LineSearch be skipped?


Thanks a lot!

Matteo
