On Oct 8, 2013, at 5:46 PM, Pierre Jolivet <[email protected]> wrote:
> Please find the log for BoomerAMG, ML and GAMG attached. The setup for
> GAMG doesn't look so bad compared to the other packages, so I'm wondering
> what is going on with those?

   They all have serious problems. It just happens that GAMG's problems are not as bad as the others.

>
>>   We need the output from running with -log_summary -pc_mg_log
>>
>>   Also you can run with PETSc's AMG called GAMG (run with -pc_type gamg).
>>   This will give the most useful information about where it is spending
>>   the time.
>>
>>   Barry
>>
>> On Oct 8, 2013, at 4:11 PM, Pierre Jolivet <[email protected]> wrote:
>>
>>> Dear all,
>>> I'm trying to compare linear solvers for a simple Poisson equation in 3D.
>>> I thought that MG was the way to go, but looking at my log, the
>>> performance looks abysmal (I know that the matrices are way too small,
>>> but if I go bigger, it just never performs a single iteration...). Even
>>> though this is neither the BoomerAMG nor the ML mailing list, could you
>>> please tell me if PETSc sets some default flags that make the setup for
>>> those solvers so slow for this simple problem? The performance of (G)ASM
>>> is in comparison much better.
>>>
>>> Thanks in advance for your help.
>>>
>>> PS: first the BoomerAMG log, then ML (much more verbose, sorry).
>>>
>>>   0 KSP Residual norm 1.599647112604e+00
>>>   1 KSP Residual norm 5.450838232404e-02
>>>   2 KSP Residual norm 3.549673478318e-03
>>>   3 KSP Residual norm 2.901826808841e-04
>>>   4 KSP Residual norm 2.574235778729e-05
>>>   5 KSP Residual norm 2.253410171682e-06
>>>   6 KSP Residual norm 1.871067784877e-07
>>>   7 KSP Residual norm 1.681162800670e-08
>>>   8 KSP Residual norm 2.120841512414e-09
>>> KSP Object: 2048 MPI processes
>>>   type: gmres
>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>     GMRES: happy breakdown tolerance 1e-30
>>>   maximum iterations=200, initial guess is zero
>>>   tolerances: relative=1e-08, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 2048 MPI processes
>>>   type: hypre
>>>     HYPRE BoomerAMG preconditioning
>>>     HYPRE BoomerAMG: Cycle type V
>>>     HYPRE BoomerAMG: Maximum number of levels 25
>>>     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>>>     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
>>>     HYPRE BoomerAMG: Threshold for strong coupling 0.25
>>>     HYPRE BoomerAMG: Interpolation truncation factor 0
>>>     HYPRE BoomerAMG: Interpolation: max elements per row 0
>>>     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>>>     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>>>     HYPRE BoomerAMG: Maximum row sums 0.9
>>>     HYPRE BoomerAMG: Sweeps down 1
>>>     HYPRE BoomerAMG: Sweeps up 1
>>>     HYPRE BoomerAMG: Sweeps on coarse 1
>>>     HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
>>>     HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
>>>     HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
>>>     HYPRE BoomerAMG: Relax weight (all) 1
>>>     HYPRE BoomerAMG: Outer relax weight (all) 1
>>>     HYPRE BoomerAMG: Using CF-relaxation
>>>     HYPRE BoomerAMG: Measure type local
>>>     HYPRE BoomerAMG: Coarsen type Falgout
>>>     HYPRE BoomerAMG: Interpolation type classical
>>>   linear system matrix = precond matrix:
>>>   Matrix Object: 2048 MPI processes
>>>     type: mpiaij
>>>     rows=4173281, cols=4173281
>>>     total: nonzeros=102576661, allocated nonzeros=102576661
>>>     total number of mallocs used during MatSetValues calls =0
>>>       not using I-node (on process 0) routines
>>> --- system solved with PETSc (in 1.005199e+02 seconds)
>>>
>>>   0 KSP Residual norm 2.368804472986e-01
>>>   1 KSP Residual norm 5.676430019132e-02
>>>   2 KSP Residual norm 1.898005876002e-02
>>>   3 KSP Residual norm 6.193922902926e-03
>>>   4 KSP Residual norm 2.008448794493e-03
>>>   5 KSP Residual norm 6.390465670228e-04
>>>   6 KSP Residual norm 2.157709394389e-04
>>>   7 KSP Residual norm 7.295973819979e-05
>>>   8 KSP Residual norm 2.358343271482e-05
>>>   9 KSP Residual norm 7.489696222066e-06
>>>  10 KSP Residual norm 2.390946857593e-06
>>>  11 KSP Residual norm 8.068086385140e-07
>>>  12 KSP Residual norm 2.706607789749e-07
>>>  13 KSP Residual norm 8.636910863376e-08
>>>  14 KSP Residual norm 2.761981175852e-08
>>>  15 KSP Residual norm 8.755459874369e-09
>>>  16 KSP Residual norm 2.708848598341e-09
>>>  17 KSP Residual norm 8.968748876265e-10
>>> KSP Object: 2048 MPI processes
>>>   type: gmres
>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>     GMRES: happy breakdown tolerance 1e-30
>>>   maximum iterations=200, initial guess is zero
>>>   tolerances: relative=1e-08, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 2048 MPI processes
>>>   type: ml
>>>     MG: type is MULTIPLICATIVE, levels=3 cycles=v
>>>       Cycles per PCApply=1
>>>       Using Galerkin computed coarse grid matrices
>>>   Coarse grid solver -- level -------------------------------
>>>     KSP Object: (mg_coarse_) 2048 MPI processes
>>>       type: preonly
>>>       maximum iterations=1, initial guess is zero
>>>       tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using NONE norm type for convergence test
>>>     PC Object: (mg_coarse_) 2048 MPI processes
>>>       type: redundant
>>>         Redundant preconditioner: First (color=0) of 2048 PCs follows
>>>       KSP Object: (mg_coarse_redundant_) 1 MPI processes
>>>         type: preonly
>>>         maximum iterations=10000, initial guess is zero
>>>         tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>         left preconditioning
>>>         using NONE norm type for convergence test
>>>       PC Object: (mg_coarse_redundant_) 1 MPI processes
>>>         type: lu
>>>           LU: out-of-place factorization
>>>           tolerance for zero pivot 2.22045e-14
>>>           using diagonal shift on blocks to prevent zero pivot
>>>           matrix ordering: nd
>>>           factor fill ratio given 5, needed 4.38504
>>>             Factored matrix follows:
>>>               Matrix Object: 1 MPI processes
>>>                 type: seqaij
>>>                 rows=2055, cols=2055
>>>                 package used to perform factorization: petsc
>>>                 total: nonzeros=2476747, allocated nonzeros=2476747
>>>                 total number of mallocs used during MatSetValues calls =0
>>>                   using I-node routines: found 1638 nodes, limit used is 5
>>>         linear system matrix = precond matrix:
>>>         Matrix Object: 1 MPI processes
>>>           type: seqaij
>>>           rows=2055, cols=2055
>>>           total: nonzeros=564817, allocated nonzeros=1093260
>>>           total number of mallocs used during MatSetValues calls =0
>>>             not using I-node routines
>>>       linear system matrix = precond matrix:
>>>       Matrix Object: 2048 MPI processes
>>>         type: mpiaij
>>>         rows=2055, cols=2055
>>>         total: nonzeros=564817, allocated nonzeros=564817
>>>         total number of mallocs used during MatSetValues calls =0
>>>           not using I-node (on process 0) routines
>>>   Down solver (pre-smoother) on level 1 -------------------------------
>>>     KSP Object: (mg_levels_1_) 2048 MPI processes
>>>       type: richardson
>>>         Richardson: damping factor=1
>>>       maximum iterations=2
>>>       tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using nonzero initial guess
>>>       using NONE norm type for convergence test
>>>     PC Object: (mg_levels_1_) 2048 MPI processes
>>>       type: sor
>>>         SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>>>       linear system matrix = precond matrix:
>>>       Matrix Object: 2048 MPI processes
>>>         type: mpiaij
>>>         rows=30194, cols=30194
>>>         total: nonzeros=3368414, allocated nonzeros=3368414
>>>         total number of mallocs used during MatSetValues calls =0
>>>           not using I-node (on process 0) routines
>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>   Down solver (pre-smoother) on level 2 -------------------------------
>>>     KSP Object: (mg_levels_2_) 2048 MPI processes
>>>       type: richardson
>>>         Richardson: damping factor=1
>>>       maximum iterations=2
>>>       tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using nonzero initial guess
>>>       using NONE norm type for convergence test
>>>     PC Object: (mg_levels_2_) 2048 MPI processes
>>>       type: sor
>>>         SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>>>       linear system matrix = precond matrix:
>>>       Matrix Object: 2048 MPI processes
>>>         type: mpiaij
>>>         rows=531441, cols=531441
>>>         total: nonzeros=12476324, allocated nonzeros=12476324
>>>         total number of mallocs used during MatSetValues calls =0
>>>           not using I-node (on process 0) routines
>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>   linear system matrix = precond matrix:
>>>   Matrix Object: 2048 MPI processes
>>>     type: mpiaij
>>>     rows=531441, cols=531441
>>>     total: nonzeros=12476324, allocated nonzeros=12476324
>>>     total number of mallocs used during MatSetValues calls =0
>>>       not using I-node (on process 0) routines
>>> --- system solved with PETSc (in 2.407844e+02 seconds)
>>>
>>
>
> <log-GAMG><log-ML><log-BoomerAMG>
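
[Editor's note] For context, the runs being compared above amount to a plain 3D Poisson solve in which only the preconditioner and the profiling options change on the command line. Below is a minimal sketch of such a driver, written against the PETSc API of that era (circa 3.4, where -log_summary is the profiling option); it is not the poster's actual code, and the executable name, process count and grid size are placeholders. It assembles a 7-point Laplacian by hand and leaves every solver choice to the options database, so -pc_type hypre, -pc_type ml or -pc_type gamg together with -log_summary -pc_mg_log can be passed at run time.

/* sketch_poisson3d.c -- minimal sketch, NOT the code that produced the logs.
 * Assembles the 7-point finite-difference Laplacian on an n x n x n grid
 * (homogeneous Dirichlet boundary, h^2 scaling omitted) and solves A x = b
 * with a KSP configured entirely from the options database, e.g.
 *
 *   mpiexec -n 2048 ./sketch_poisson3d -ksp_type gmres -ksp_monitor \
 *           -ksp_rtol 1e-8 -pc_type gamg -log_summary -pc_mg_log
 */
static char help[] = "Sketch: 3D Poisson solved with a command-line-configured KSP.\n";

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscInt       n = 32, N, Istart, Iend, row, i, j, k; /* grid size hardcoded for brevity */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char*)0, help);CHKERRQ(ierr);
  N    = n*n*n;

  /* Distributed AIJ matrix; 7 nonzeros per row is a safe preallocation bound */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A, 7, NULL, 6, NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A, 7, NULL);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);

  /* Natural ordering of the grid: row = i + j*n + k*n*n */
  for (row = Istart; row < Iend; row++) {
    i = row % n;  j = (row / n) % n;  k = row / (n*n);
    ierr = MatSetValue(A, row, row, 6.0, INSERT_VALUES);CHKERRQ(ierr);
    if (i > 0)   {ierr = MatSetValue(A, row, row-1,   -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n-1) {ierr = MatSetValue(A, row, row+1,   -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (j > 0)   {ierr = MatSetValue(A, row, row-n,   -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (j < n-1) {ierr = MatSetValue(A, row, row+n,   -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (k > 0)   {ierr = MatSetValue(A, row, row-n*n, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (k < n-1) {ierr = MatSetValue(A, row, row+n*n, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr);
  ierr = VecSetSizes(b, PETSC_DECIDE, N);CHKERRQ(ierr);
  ierr = VecSetFromOptions(b);CHKERRQ(ierr);
  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr); /* arbitrary right-hand side */

  /* All solver choices (GMRES, BoomerAMG/ML/GAMG, tolerances, monitors,
     logging) are picked up from the command line by KSPSetFromOptions() */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); /* pre-3.5 signature */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}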
