Do you use KSPSetDM(ksp,da)?  See src/ksp/ksp/examples/tutorials/ex19.c
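
For reference, the usual structure looks roughly like the sketch below. This is only a minimal
illustration (not Michele's code): ComputeMatrix and ComputeRHS are placeholder names for user
callbacks that assemble the operator and right-hand side on whatever DMDA the KSP hands them,
and the callback signatures follow the PETSc 3.4 interface. With this setup, -pc_type mg
-da_refine 4 refines the coarse DMDA four times and builds the geometric multigrid hierarchy
automatically.

    #include <petscdmda.h>
    #include <petscksp.h>

    /* Placeholder callbacks: assemble A and b on the DM currently attached to the KSP
       (PETSc 3.4-style signatures). */
    extern PetscErrorCode ComputeMatrix(KSP,Mat,Mat,MatStructure*,void*);
    extern PetscErrorCode ComputeRHS(KSP,Vec,void*);

    int main(int argc,char **argv)
    {
      KSP            ksp;
      DM             da;
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
      /* Small coarse grid (negative sizes = changeable from the command line);
         -da_refine N refines it N times before the solve */
      ierr = DMDACreate3d(PETSC_COMM_WORLD,
                          DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,
                          DMDA_STENCIL_STAR,-5,-5,-5,
                          PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
                          1,1,NULL,NULL,NULL,&da);CHKERRQ(ierr);
      ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
      ierr = KSPSetDM(ksp,da);CHKERRQ(ierr);                     /* the key call */
      ierr = KSPSetComputeOperators(ksp,ComputeMatrix,NULL);CHKERRQ(ierr);
      ierr = KSPSetComputeRHS(ksp,ComputeRHS,NULL);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
      ierr = KSPSolve(ksp,NULL,NULL);CHKERRQ(ierr);              /* uses DM-built vectors */
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
      ierr = DMDestroy(&da);CHKERRQ(ierr);
      ierr = PetscFinalize();
      return 0;
    }

Without KSPSetDM the KSP never sees the DMDA, which would explain why -da_refine is reported
as unused and PCMG stays at levels=1 in the output below.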

   Barry
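
PS: On the GAMG question further down in the thread: the two options being used can also be
hardwired into the code if that is more convenient. A rough sketch using the PETSc 3.4 options
database (the option names are copied from the -options_left output quoted below):

    /* Set the GAMG options programmatically before KSPSetFromOptions() is called
       (PetscOptionsSetValue took no PetscOptions argument in PETSc 3.4). */
    ierr = PetscOptionsSetValue("-pc_type","gamg");CHKERRQ(ierr);
    ierr = PetscOptionsSetValue("-pc_gamg_agg_nsmooths","1");CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* picks the entries up */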

On Aug 1, 2013, at 1:35 PM, Michele Rosso <mro...@uci.edu> wrote:

> Barry,
> 
> I am using a finite-difference uniform Cartesian grid with DMDA, and so far it has not given me any problems.
> I am using a KSP solver (not SNES). In a previous thread I was told that an odd number of grid points is needed for geometric multigrid; is that correct?
> I tried to run my case with
> 
> -pc_type mg -da_refine 4
> 
> but it does not seem to use the -da_refine option:
> 
> mpiexec -np 4 ./test -pc_type mg -da_refine 4 -ksp_view -options_left
> 
> 
> KSP Object: 4 MPI processes
>  type: cg
>  maximum iterations=10000
>  tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>  left preconditioning
>  using nonzero initial guess
>  using UNPRECONDITIONED norm type for convergence test
> PC Object: 4 MPI processes
>  type: mg
>    MG: type is MULTIPLICATIVE, levels=1 cycles=v
>      Cycles per PCApply=1
>      Not using Galerkin computed coarse grid matrices
>  Coarse grid solver -- level -------------------------------
>    KSP Object:    (mg_levels_0_)     4 MPI processes
>      type: chebyshev
>        Chebyshev: eigenvalue estimates:  min = 0.134543, max = 1.47998
>        Chebyshev: estimated using:  [0 0.1; 0 1.1]
>        KSP Object:        (mg_levels_0_est_)         4 MPI processes
>          type: gmres
>            GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>            GMRES: happy breakdown tolerance 1e-30
>          maximum iterations=10, initial guess is zero
>          tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>          left preconditioning
>          using NONE norm type for convergence test
>        PC Object:        (mg_levels_0_)         4 MPI processes
>          type: sor
>            SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>          linear system matrix = precond matrix:
>          Matrix Object:           4 MPI processes
>            type: mpiaij
>            rows=262144, cols=262144
>            total: nonzeros=1835008, allocated nonzeros=1835008
>            total number of mallocs used during MatSetValues calls =0
>      maximum iterations=1, initial guess is zero
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>      using NONE norm type for convergence test
>    PC Object:    (mg_levels_0_)     4 MPI processes
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:       4 MPI processes
>        type: mpiaij
>        rows=262144, cols=262144
>        total: nonzeros=1835008, allocated nonzeros=1835008
>        total number of mallocs used during MatSetValues calls =0
>  linear system matrix = precond matrix:
>  Matrix Object:   4 MPI processes
>    type: mpiaij
>    rows=262144, cols=262144
>    total: nonzeros=1835008, allocated nonzeros=1835008
>    total number of mallocs used during MatSetValues calls =0
> Solution       =    1.53600013     sec
> #PETSc Option Table entries:
> -da_refine 4
> -ksp_view
> -options_left
> -pc_type mg
> #End of PETSc Option Table entries
> There is one unused database option. It is:
> Option left: name:-da_refine value: 4
> 
> Michele
> 
> On 08/01/2013 11:21 AM, Barry Smith wrote:
>>    What kind of mesh are you using? Are you using DMDA? If you are using DMDA (and have
>> written your code to use it "correctly"), then it should be trivial to run with geometric
>> multigrid, and geometric multigrid should be a bit faster.
>> 
>>    For example on src/snes/examples/tutorials/ex19.c I run with ./ex19 -pc_type mg -da_refine 4
>> and it refines the original DMDA 4 times and uses geometric multigrid with 5 levels.
>> 
>> 
>>    Barry
>> 
>> 
>> On Aug 1, 2013, at 1:14 PM, Michele Rosso <mro...@uci.edu> wrote:
>> 
>>> Hi,
>>> 
>>> I am successfully using PETSc (v3.4.2) to solve a 3D Poisson equation with CG + GAMG,
>>> as suggested in a previous thread.
>>> So far I am using GAMG with the default settings, i.e.
>>> 
>>> -pc_type gamg -pc_gamg_agg_nsmooths 1
>>> 
>>> The speed of the solution is satisfactory, but I would like to know if you have any
>>> suggestions to speed it up further, in particular whether there are any parameters worth
>>> tuning (for example, the number of levels) to achieve an even faster solution.
>>> So far I am using Dirichlet BCs for my test case, but I will soon have periodic
>>> conditions: in that case, does GAMG require any particular settings?
>>> Finally, I have not tried geometric multigrid: do you think it is worth a shot?
>>> 
>>> Here are my current settings:
>>> 
>>> I run with
>>> 
>>> -pc_type gamg -pc_gamg_agg_nsmooths 1 -ksp_view -options_left
>>> 
>>> and the output is:
>>> 
>>> KSP Object: 4 MPI processes
>>>   type: cg
>>>   maximum iterations=10000
>>>   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using nonzero initial guess
>>>   using UNPRECONDITIONED norm type for convergence test
>>> PC Object: 4 MPI processes
>>>   type: gamg
>>>     MG: type is MULTIPLICATIVE, levels=3 cycles=v
>>>       Cycles per PCApply=1
>>>       Using Galerkin computed coarse grid matrices
>>>   Coarse grid solver -- level -------------------------------
>>>     KSP Object:    (mg_coarse_)     4 MPI processes
>>>       type: preonly
>>>       maximum iterations=1, initial guess is zero
>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using NONE norm type for convergence test
>>>     PC Object:    (mg_coarse_)     4 MPI processes
>>>       type: bjacobi
>>>         block Jacobi: number of blocks = 4
>>>         Local solve info for each block is in the following KSP and PC objects:
>>>       [0] number of local blocks = 1, first local block number = 0
>>>                 [0] local block number 0
>>> KSP Object:          (mg_coarse_sub_)         1 MPI processes
>>>           type: preonly
>>>           maximum iterations=1, initial guess is zero
>>>                 tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>> KSP Object:        (mg_coarse_sub_)            left preconditioning
>>>           using NONE norm type for convergence test
>>>           PC Object:        (mg_coarse_sub_)       1 MPI processes
>>>           type: preonly
>>>          1 MPI processes
>>>           type: lu
>>>           maximum iterations=1, initial guess is zero
>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>           LU: out-of-place factorization
>>>             left preconditioning
>>>           using NONE norm type for convergence test
>>>           PC Object:        (mg_coarse_sub_)         1 MPI processes
>>>           type: lu
>>>           tolerance for zero pivot 2.22045e-14
>>>             using diagonal shift on blocks to prevent zero pivot
>>>             matrix ordering: nd
>>>             LU: out-of-place factorization
>>>             tolerance for zero pivot 2.22045e-14
>>>             using diagonal shift on blocks to prevent zero pivot
>>>             matrix ordering: nd
>>>             factor fill ratio given 5, needed 0
>>>               Factored matrix follows:
>>>             factor fill ratio given 5, needed 4.13207
>>>               Factored matrix follows:
>>>                   Matrix Object:              Matrix Object:                
>>>  1 MPI processes
>>>                   type: seqaij
>>>                     rows=395, cols=395
>>>                     package used to perform factorization: petsc
>>>                   total: nonzeros=132379, allocated nonzeros=132379
>>>                   total number of mallocs used during MatSetValues calls =0
>>>                         not using I-node routines
>>>            1 MPI processes
>>>                   type: seqaij
>>>           linear system matrix = precond matrix:
>>>                     rows=0, cols=0
>>>                     package used to perform factorization: petsc
>>>                   total: nonzeros=1, allocated nonzeros=1
>>>                     total number of mallocs used during MatSetValues calls =0
>>>                       not using I-node routines
>>>               linear system matrix = precond matrix:
>>>   Matrix Object:             1 MPI processes
>>>             type: seqaij
>>>           Matrix Object:KSP Object:           1 MPI processes
>>>             type: seqaij
>>>             rows=0, cols=0
>>>             total: nonzeros=0, allocated nonzeros=0
>>>             total number of mallocs used during MatSetValues calls =0
>>>                 not using I-node routines
>>>           rows=395, cols=395
>>>             total: nonzeros=32037, allocated nonzeros=32037
>>>             total number of mallocs used during MatSetValues calls =0
>>>               not using I-node routines
>>>           - - - - - - - - - - - - - - - - - -
>>>           KSP Object:        (mg_coarse_sub_)         1 MPI processes
>>>           type: preonly
>>>           maximum iterations=1, initial guess is zero
>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>           left preconditioning
>>>           using NONE norm type for convergence test
>>>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>>>           type: lu
>>>             LU: out-of-place factorization
>>>             tolerance for zero pivot 2.22045e-14
>>>             using diagonal shift on blocks to prevent zero pivot
>>>             matrix ordering: nd
>>>             factor fill ratio given 5, needed 0
>>>               Factored matrix follows:
>>>                 Matrix Object:                 1 MPI processes
>>>                   type: seqaij
>>>                   rows=0, cols=0
>>>                   package used to perform factorization: petsc
>>>                   total: nonzeros=1, allocated nonzeros=1
>>>                   total number of mallocs used during MatSetValues calls =0
>>>                     not using I-node routines
>>>           linear system matrix = precond matrix:
>>>           Matrix Object:           1 MPI processes
>>>             type: seqaij
>>>             rows=0, cols=0
>>>             total: nonzeros=0, allocated nonzeros=0
>>>             total number of mallocs used during MatSetValues calls =0
>>>               not using I-node routines
>>>   (mg_coarse_sub_)         1 MPI processes
>>>           type: preonly
>>>           maximum iterations=1, initial guess is zero
>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>           left preconditioning
>>>           using NONE norm type for convergence test
>>>         PC Object:        (mg_coarse_sub_)         1 MPI processes
>>>           type: lu
>>>             LU: out-of-place factorization
>>>             tolerance for zero pivot 2.22045e-14
>>>             using diagonal shift on blocks to prevent zero pivot
>>>             matrix ordering: nd
>>>             factor fill ratio given 5, needed 0
>>>               Factored matrix follows:
>>>                 Matrix Object:                 1 MPI processes
>>>                   type: seqaij
>>>                   rows=0, cols=0
>>>                   package used to perform factorization: petsc
>>>                   total: nonzeros=1, allocated nonzeros=1
>>>                   total number of mallocs used during MatSetValues calls =0
>>>                     not using I-node routines
>>>           linear system matrix = precond matrix:
>>>           Matrix Object:           1 MPI processes
>>>             type: seqaij
>>>             rows=0, cols=0
>>>             total: nonzeros=0, allocated nonzeros=0
>>>             total number of mallocs used during MatSetValues calls =0
>>>               not using I-node routines
>>>       [1] number of local blocks = 1, first local block number = 1
>>>         [1] local block number 0
>>>         - - - - - - - - - - - - - - - - - -
>>>       [2] number of local blocks = 1, first local block number = 2
>>>         [2] local block number 0
>>>         - - - - - - - - - - - - - - - - - -
>>>       [3] number of local blocks = 1, first local block number = 3
>>>         [3] local block number 0
>>>         - - - - - - - - - - - - - - - - - -
>>>       linear system matrix = precond matrix:
>>>       Matrix Object:       4 MPI processes
>>>         type: mpiaij
>>>         rows=395, cols=395
>>>         total: nonzeros=32037, allocated nonzeros=32037
>>>         total number of mallocs used during MatSetValues calls =0
>>>           not using I-node (on process 0) routines
>>>   Down solver (pre-smoother) on level 1 -------------------------------
>>>     KSP Object:    (mg_levels_1_)     4 MPI processes
>>>       type: chebyshev
>>>         Chebyshev: eigenvalue estimates:  min = 0.0636225, max = 1.33607
>>>       maximum iterations=2
>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using nonzero initial guess
>>>       using NONE norm type for convergence test
>>>     PC Object:    (mg_levels_1_)     4 MPI processes
>>>       type: jacobi
>>>       linear system matrix = precond matrix:
>>>       Matrix Object:       4 MPI processes
>>>         type: mpiaij
>>>         rows=23918, cols=23918
>>>         total: nonzeros=818732, allocated nonzeros=818732
>>>         total number of mallocs used during MatSetValues calls =0
>>>           not using I-node (on process 0) routines
>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>   Down solver (pre-smoother) on level 2 -------------------------------
>>>     KSP Object:    (mg_levels_2_)     4 MPI processes
>>>       type: chebyshev
>>>         Chebyshev: eigenvalue estimates:  min = 0.0971369, max = 2.03987
>>>       maximum iterations=2
>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using nonzero initial guess
>>>       using NONE norm type for convergence test
>>>     PC Object:    (mg_levels_2_)     4 MPI processes
>>>       type: jacobi
>>>       linear system matrix = precond matrix:
>>>       Matrix Object:       4 MPI processes
>>>         type: mpiaij
>>>         rows=262144, cols=262144
>>>         total: nonzeros=1835008, allocated nonzeros=1835008
>>>         total number of mallocs used during MatSetValues calls =0
>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>   linear system matrix = precond matrix:
>>>   Matrix Object:   4 MPI processes
>>>     type: mpiaij
>>>     rows=262144, cols=262144
>>>     total: nonzeros=1835008, allocated nonzeros=1835008
>>>     total number of mallocs used during MatSetValues calls =0
>>> #PETSc Option Table entries:
>>> -ksp_view
>>> -options_left
>>> -pc_gamg_agg_nsmooths 1
>>> -pc_type gamg
>>> #End of PETSc Option Table entries
>>> There are no unused options.
>>> 
>>> 
>>> Thank you,
>>> Michele
>> 
> 
