Thanks, Mark and Barry, It works pretty wells in terms of the number of linear iterations (using "-pc_gamg_sym_graph true"), but it is horrible in the compute time. I am using the two-level method via "-pc_mg_levels 2". The reason why the compute time is larger than other preconditioning options is that a matrix free method is used in the fine level and in my particular problem the function evaluation is expensive.
I am using "-snes_mf_operator 1" to turn on the Jacobian-free Newton, but I do not think I want to make the preconditioning part matrix-free. Do you guys know how to turn off the matrix-free method for GAMG? Here is the detailed solver: *SNES Object: 384 MPI processes type: newtonls maximum iterations=200, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-08, solution=1e-50 total number of linear solver iterations=20 total number of function evaluations=166 norm schedule ALWAYS SNESLineSearch Object: 384 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 384 MPI processes type: gmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=100, initial guess is zero tolerances: relative=0.001, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 384 MPI processes type: gamg MG: type is MULTIPLICATIVE, levels=2 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices GAMG specific options Threshold for dropping small values from graph 0. AGG specific options Symmetric graph true Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 384 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 384 MPI processes type: bjacobi block Jacobi: number of blocks = 384 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1.31367 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=37, cols=37 package used to perform factorization: petsc total: nonzeros=913, allocated nonzeros=913 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=37, cols=37 total: nonzeros=695, allocated nonzeros=695 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 384 MPI processes type: mpiaij rows=18145, cols=18145 total: nonzeros=1709115, allocated nonzeros=1709115 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 384 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0.133339, max = 1.46673 Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 384 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test maximum iterations=2 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 384 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix followed by preconditioner matrix: Mat Object: 384 MPI processes type: mffd rows=3020875, cols=3020875 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: () 384 MPI processes type: mpiaij rows=3020875, cols=3020875 total: nonzeros=215671710, allocated nonzeros=241731750 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix followed by preconditioner matrix: Mat Object: 384 MPI processes type: mffd rows=3020875, cols=3020875 Matrix-free approximation: err=1.49012e-08 (relative error in function evaluation) Using wp compute h routine Does not compute normU Mat Object: () 384 MPI processes type: mpiaij rows=3020875, cols=3020875 total: nonzeros=215671710, allocated nonzeros=241731750 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines* Fande, On Thu, Apr 6, 2017 at 8:27 AM, Mark Adams <mfad...@lbl.gov> wrote: > On Tue, Apr 4, 2017 at 10:10 AM, Barry Smith <bsm...@mcs.anl.gov> wrote: > > > >> Does this mean that GAMG works for the symmetrical matrix only? > > > > No, it means that for non symmetric nonzero structure you need the > extra flag. So use the extra flag. The reason we don't always use the flag > is because it adds extra cost and isn't needed if the matrix already has a > symmetric nonzero structure. > > BTW, if you have symmetric non-zero structure you can just set > -pc_gamg_threshold -1.0', note the "or" in the message. > > If you want to mess with the threshold then you need to use the > symmetrized flag. >