Dear All,
I want to create a two-grid preconditioner for the linear Jacobian solve in a
nonlinear problem. I am using inexact Newton as the nonlinear solver and fGMRES
as the linear solver. For the preconditioner of the linear solve, I want to
build a two-level ASM preconditioner using PCMG: an additive Schwarz (ASM)
preconditioned Richardson iteration as the smoother on the fine grid, and
ASM-preconditioned GMRES as the coarse solve.
However, I have the following questions about setting up the preconditioner.
1. When I use PCMGGetSmoother to set up the fine grid smoother, it only takes
effect on the down solver, not the up solver. Even though I want to use the
same smoother for both the up and down sweeps, I have to call PCMGGetSmootherUp
and PCMGGetSmootherDown to set up the smoother separately. Do you have any
ideas about this issue?
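For concreteness, this is roughly what I end up doing (just a sketch; the
variable names kspdown, kspup, pcdown, pcup are only for illustration, and
level 1 is my fine level):

/* sketch: configure the down and up smoothers on level 1 identically */
KSP kspdown, kspup;
PC  pcdown, pcup;
ierr = PCMGGetSmootherDown(finepc,1,&kspdown);CHKERRQ(ierr);
ierr = PCMGGetSmootherUp(finepc,1,&kspup);CHKERRQ(ierr);
ierr = KSPSetType(kspdown,KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPSetType(kspup,KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPGetPC(kspdown,&pcdown);CHKERRQ(ierr);
ierr = KSPGetPC(kspup,&pcup);CHKERRQ(ierr);
ierr = PCSetType(pcdown,PCASM);CHKERRQ(ierr);
ierr = PCSetType(pcup,PCASM);CHKERRQ(ierr);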
2. My two-level preconditioner needs three ASM preconditioners: two for the up
and down smoothers, and one for the coarse solve. I want to use LU
factorization as the subdomain solver for both the up and down smoothers, but I
do not want to redo the LU factorization for the up smoother. How can I keep
the LU factorization from the down smoother so that it can be reused by the up
smoother?
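What I have in mind is something like the sketch below, where the up smoother
simply reuses the down smoother's PC object (and therefore its subdomain LU
factors). I am not sure whether sharing one PC between two KSPs like this is
the intended usage:

/* sketch: let the up smoother reuse the down smoother's ASM PC, so the
   subdomain LU factorization is computed once and shared */
KSP kspdown, kspup;
PC  asmshared;
ierr = PCMGGetSmootherDown(finepc,1,&kspdown);CHKERRQ(ierr);
ierr = KSPGetPC(kspdown,&asmshared);CHKERRQ(ierr);
ierr = PCMGGetSmootherUp(finepc,1,&kspup);CHKERRQ(ierr);
ierr = KSPSetPC(kspup,asmshared);CHKERRQ(ierr);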
3. I ran my code with my two-level preconditioner. The SNES converges in two
iterations.
0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00.
1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
In the log summary, I checked the number of factorizations:
MatLUFactorSym 4 1.0 1.1232e+00 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+01 2 0 0 0 1 2 0 0 0 1 0
MatLUFactorNum 4 1.0 1.3245e+01 2.2 1.22e+09 3.0 0.0e+00 0.0e+00 0.0e+00 28 86 0 0 0 30 86 0 0 0 553
Do you have any idea why the count for MatLUFactorSym is 4? And is there any
way to find out how many LU factorizations are done for the coarse solve, and
how many for the up and down smoothers? I believe the number should be six: in
each SNES iteration I need two LU factorizations for the up and down smoothers
and one for the coarse solve, so with two SNES iterations the number of LU
factorizations should be 6 instead of 4.
Thank you so much for your help. Below is the output from SNESView for my
program, followed by the setup of my preconditioner.
Best
Yuqi Wu
/* SNES view */
SNES Object: 8 MPI processes
type: ls
line search variant: SNESLineSearchCubic
alpha=1.000000000000e-04, maxstep=1.000000000000e+08,
minlambda=1.000000000000e-12
maximum iterations=10, maximum function evaluations=10000
tolerances: relative=1e-07, absolute=1e-50, solution=1e-08
total number of linear solver iterations=9
total number of function evaluations=3
KSP Object: 8 MPI processes
type: fgmres
GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1000, initial guess is zero
tolerances: relative=1e-06, absolute=1e-14, divergence=10000
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 8 MPI processes
type: mg
MG: type is MULTIPLICATIVE, levels=2 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (coarse_) 8 MPI processes
type: gmres
GMRES: restart=100, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1000, initial guess is zero
tolerances: relative=0.001, absolute=1e-50, divergence=10000
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: (coarse_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC
objects:
KSP Object: (coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (coarse_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 3.73447
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=840, cols=840
package used to perform factorization: petsc
total: nonzeros=384486, allocated nonzeros=384486
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 349 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=840, cols=840
total: nonzeros=102956, allocated nonzeros=102956
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 360 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=4186, cols=4186
total: nonzeros=656174, allocated nonzeros=656174
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 315 nodes, limit used
is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 8 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (fine_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC
objects:
KSP Object: (fine_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (fine_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 5.49559
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
package used to perform factorization: petsc
total: nonzeros=401008, allocated nonzeros=401008
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1491 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
total: nonzeros=72969, allocated nonzeros=72969
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1532 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used
is 5
Up solver (post-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 8 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (fine_) 8 MPI processes
type: asm
Additive Schwarz: total subdomain blocks = 8, user-defined overlap
Additive Schwarz: restriction/interpolation type - RESTRICT
Local solve is same for all blocks, in the following KSP and PC
objects:
KSP Object: (fine_sub_) 1 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (fine_sub_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 5.49559
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
package used to perform factorization: petsc
total: nonzeros=401008, allocated nonzeros=401008
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1491 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2100, cols=2100
total: nonzeros=72969, allocated nonzeros=72969
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1532 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used
is 5
linear system matrix = precond matrix:
Matrix Object: 8 MPI processes
type: mpiaij
rows=11585, cols=11585
total: nonzeros=458097, allocated nonzeros=3026770
total number of mallocs used during MatSetValues calls =978
using I-node (on process 0) routines: found 1365 nodes, limit used is 5
SNES converged: CONVERGED_FNORM_RELATIVE.
//*******************************
Below is the setup of my preconditioner:
/* set up the MG preconditioner */
ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr);
ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr);
ierr = PCSetType(finepc,PCMG);CHKERRQ(ierr);
ierr = PCMGSetType(finepc,PC_MG_MULTIPLICATIVE);CHKERRQ(ierr);
ierr = PCMGSetLevels(finepc,2,PETSC_NULL);CHKERRQ(ierr);
ierr = PCMGSetCycleType(finepc,PC_MG_CYCLE_V);CHKERRQ(ierr);
ierr = PCMGSetNumberSmoothUp(finepc,1);CHKERRQ(ierr);
ierr = PCMGSetNumberSmoothDown(finepc,1);CHKERRQ(ierr);
ierr = PCMGSetGalerkin(finepc,PETSC_TRUE);CHKERRQ(ierr);
ierr = PCMGSetResidual(finepc,1,PCMGDefaultResidual,algebra->J);CHKERRQ(ierr);
ierr = PCMGSetInterpolation(finepc,1,ctx->Interp);CHKERRQ(ierr);
/* set up the coarse solve */
ierr = PCMGGetCoarseSolve(finepc,&ctx->coarseksp);CHKERRQ(ierr);
ierr = KSPSetOptionsPrefix(ctx->coarseksp,"coarse_");CHKERRQ(ierr);
ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr);
/* set up the fine grid smoother */
ierr = PCMGGetSmoother(finepc,1,&kspsmooth);CHKERRQ(ierr);
ierr = KSPSetType(kspsmooth, KSPRICHARDSON);CHKERRQ(ierr);
ierr = KSPGetPC(kspsmooth,&asmpc);CHKERRQ(ierr);
ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr);
ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr);
ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm,PETSC_NULL);CHKERRQ(ierr);