You can use the option -pc_mg_galerkin, and then MG will compute the coarser matrices with a sparse matrix-matrix-matrix product, so you should not need to change your code to try it out. You still need to call KSPSetDM() and use -da_refine n to get it working.
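In outline, the setup being described looks like this (a minimal sketch, not code from this thread: ComputeRHS and ComputeMatrix are placeholders for your own assembly routines, the grid size and boundary types are made up, and the call signatures are the PETSc 3.4.x ones):

#include <petscksp.h>
#include <petscdmda.h>

/* Placeholders for the user's own routines that fill the rhs and the
   matrix for whatever DMDA (level) the KSP hands them. */
extern PetscErrorCode ComputeRHS(KSP,Vec,void*);
extern PetscErrorCode ComputeMatrix(KSP,Mat,Mat,MatStructure*,void*);

int main(int argc,char **argv)
{
  KSP            ksp;
  DM             da;
  Vec            x;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);
  /* Coarsest grid; -da_refine n refines it n times before the solve. */
  ierr = DMDACreate3d(PETSC_COMM_WORLD,
                      DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,DMDA_BOUNDARY_NONE,
                      DMDA_STENCIL_STAR,17,17,17,
                      PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
                      1,1,NULL,NULL,NULL,&da);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetDM(ksp,da);CHKERRQ(ierr);                 /* lets PCMG build the level hierarchy */
  ierr = KSPSetComputeRHS(ksp,ComputeRHS,NULL);CHKERRQ(ierr);
  ierr = KSPSetComputeOperators(ksp,ComputeMatrix,NULL);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);           /* picks up -pc_type mg -da_refine n etc. */
  ierr = KSPSolve(ksp,NULL,NULL);CHKERRQ(ierr);          /* rhs and matrix come from the callbacks */
  ierr = KSPGetSolution(ksp,&x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = DMDestroy(&da);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

With -pc_mg_galerkin the matrix callback is only needed on the finest level; PCMG forms the coarser matrices itself by the sparse triple product.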
If it doesn't work, send us all the output.

   Barry

On Aug 1, 2013, at 2:47 PM, Michele Rosso <mro...@uci.edu> wrote:

> Barry,
>
> you are correct, I did not use it. I think I now see where the problem is.
> Correct me if I am wrong, but for geometric multigrid to work, the KSP must be provided with subroutines that compute the matrix and the rhs at any level, through KSPSetComputeOperators and KSPSetComputeRHS.
> I do not do that; I simply build a rhs vector and a matrix and then solve the system.
> If you confirm what I just wrote, I will try to modify my code accordingly and get back to you.
>
> Thank you,
> Michele
>
> On 08/01/2013 11:48 AM, Barry Smith wrote:
>> Do you use KSPSetDM(ksp,da)? See src/ksp/ksp/examples/tutorials/ex19.c
>>
>>    Barry
>>
>> On Aug 1, 2013, at 1:35 PM, Michele Rosso <mro...@uci.edu> wrote:
>>
>>> Barry,
>>>
>>> I am using a finite-difference Cartesian uniform grid with DMDA, and so far it has not given me any problems.
>>> I am using a KSP solver (not SNES). In a previous thread I was told that an odd number of grid points was needed for geometric multigrid; is that correct?
>>> I tried to run my case with
>>>
>>>    -pc_type mg -da_refine 4
>>>
>>> but it does not seem to use the -da_refine option:
>>>
>>> mpiexec -np 4 ./test -pc_type mg -da_refine 4 -ksp_view -options_left
>>>
>>> KSP Object: 4 MPI processes
>>>   type: cg
>>>   maximum iterations=10000
>>>   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>>>   left preconditioning
>>>   using nonzero initial guess
>>>   using UNPRECONDITIONED norm type for convergence test
>>> PC Object: 4 MPI processes
>>>   type: mg
>>>     MG: type is MULTIPLICATIVE, levels=1 cycles=v
>>>       Cycles per PCApply=1
>>>       Not using Galerkin computed coarse grid matrices
>>>   Coarse grid solver -- level -------------------------------
>>>     KSP Object: (mg_levels_0_) 4 MPI processes
>>>       type: chebyshev
>>>         Chebyshev: eigenvalue estimates:  min = 0.134543, max = 1.47998
>>>         Chebyshev: estimated using:  [0 0.1; 0 1.1]
>>>         KSP Object: (mg_levels_0_est_) 4 MPI processes
>>>           type: gmres
>>>             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>             GMRES: happy breakdown tolerance 1e-30
>>>           maximum iterations=10, initial guess is zero
>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>           left preconditioning
>>>           using NONE norm type for convergence test
>>>         PC Object: (mg_levels_0_) 4 MPI processes
>>>           type: sor
>>>             SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>>>           linear system matrix = precond matrix:
>>>           Matrix Object: 4 MPI processes
>>>             type: mpiaij
>>>             rows=262144, cols=262144
>>>             total: nonzeros=1835008, allocated nonzeros=1835008
>>>             total number of mallocs used during MatSetValues calls =0
>>>       maximum iterations=1, initial guess is zero
>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>       left preconditioning
>>>       using NONE norm type for convergence test
>>>     PC Object: (mg_levels_0_) 4 MPI processes
>>>       type: sor
>>>         SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
>>>       linear system matrix = precond matrix:
>>>       Matrix Object: 4 MPI processes
>>>         type: mpiaij
>>>         rows=262144, cols=262144
>>>         total: nonzeros=1835008, allocated nonzeros=1835008
>>>         total number of mallocs used during MatSetValues calls =0
>>>   linear system matrix = precond matrix:
>>>   Matrix Object: 4 MPI processes
>>>     type: mpiaij
>>>     rows=262144, cols=262144
>>>     total: nonzeros=1835008, allocated nonzeros=1835008
>>>     total number of mallocs used during MatSetValues calls =0
>>>
>>> Solution = 1.53600013 sec
>>>
>>> #PETSc Option Table entries:
>>> -da_refine 4
>>> -ksp_view
>>> -options_left
>>> -pc_type mg
>>> #End of PETSc Option Table entries
>>> There is one unused database option. It is:
>>> Option left: name:-da_refine value: 4
>>>
>>> Michele
>>>
>>> On 08/01/2013 11:21 AM, Barry Smith wrote:
>>>
>>>> What kind of mesh are you using? Are you using DMDA? If you are using DMDA (and have written your code to use it "correctly"), then it should be trivial to run with geometric multigrid, and geometric multigrid should be a bit faster.
>>>>
>>>> For example, on src/snes/examples/tutorials/ex19.c I run with ./ex19 -pc_type mg -da_refine 4 and it refines the original DMDA 4 times and uses geometric multigrid with 5 levels.
>>>>
>>>>    Barry
>>>>
>>>> On Aug 1, 2013, at 1:14 PM, Michele Rosso <mro...@uci.edu> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am successfully using PETSc (v3.4.2) to solve a 3D Poisson equation with CG + GAMG, as I was suggested to do in a previous thread.
>>>>> So far I am using GAMG with the default settings, i.e.
>>>>>
>>>>>    -pc_type gamg -pc_gamg_agg_nsmooths 1
>>>>>
>>>>> The speed of the solution is satisfactory, but I would like to know if you have any suggestions to further speed it up, particularly whether there are any parameters worth looking into to achieve an even faster solution, for example the number of levels and so on.
>>>>> So far I am using Dirichlet BCs for my test case, but I will soon have periodic conditions: in this case, does GAMG require particular settings?
>>>>> Finally, I did not try geometric multigrid: do you think it is worth a shot?
>>>>>
>>>>> Here are my current settings:
>>>>>
>>>>> I run with
>>>>>
>>>>>    -pc_type gamg -pc_gamg_agg_nsmooths 1 -ksp_view -options_left
>>>>>
>>>>> and the output is:
>>>>>
>>>>> KSP Object: 4 MPI processes
>>>>>   type: cg
>>>>>   maximum iterations=10000
>>>>>   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>>>>>   left preconditioning
>>>>>   using nonzero initial guess
>>>>>   using UNPRECONDITIONED norm type for convergence test
>>>>> PC Object: 4 MPI processes
>>>>>   type: gamg
>>>>>     MG: type is MULTIPLICATIVE, levels=3 cycles=v
>>>>>       Cycles per PCApply=1
>>>>>       Using Galerkin computed coarse grid matrices
>>>>>   Coarse grid solver -- level -------------------------------
>>>>>     KSP Object: (mg_coarse_) 4 MPI processes
>>>>>       type: preonly
>>>>>       maximum iterations=1, initial guess is zero
>>>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>       left preconditioning
>>>>>       using NONE norm type for convergence test
>>>>>     PC Object: (mg_coarse_) 4 MPI processes
>>>>>       type: bjacobi
>>>>>         block Jacobi: number of blocks = 4
>>>>>         Local solve info for each block is in the following KSP and PC objects:
>>>>>       [0] number of local blocks = 1, first local block number = 0
>>>>>         [0] local block number 0
>>>>>         KSP Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: preonly
>>>>>           maximum iterations=1, initial guess is zero
>>>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>         KSP Object: (mg_coarse_sub_) left preconditioning
>>>>>           using NONE norm type for convergence test
>>>>>         PC Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: preonly
>>>>>           1 MPI processes
>>>>>           type: lu
>>>>>           maximum iterations=1, initial guess is zero
>>>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>             LU: out-of-place factorization
>>>>>           left preconditioning
>>>>>           using NONE norm type for convergence test
>>>>>         PC Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: lu
>>>>>             tolerance for zero pivot 2.22045e-14
>>>>>             using diagonal shift on blocks to prevent zero pivot
>>>>>             matrix ordering: nd
>>>>>             LU: out-of-place factorization
>>>>>             tolerance for zero pivot 2.22045e-14
>>>>>             using diagonal shift on blocks to prevent zero pivot
>>>>>             matrix ordering: nd
>>>>>             factor fill ratio given 5, needed 0
>>>>>               Factored matrix follows:
>>>>>             factor fill ratio given 5, needed 4.13207
>>>>>               Factored matrix follows:
>>>>>                 Matrix Object: Matrix Object: 1 MPI processes
>>>>>                   type: seqaij
>>>>>                   rows=395, cols=395
>>>>>                   package used to perform factorization: petsc
>>>>>                   total: nonzeros=132379, allocated nonzeros=132379
>>>>>                   total number of mallocs used during MatSetValues calls =0
>>>>>                     not using I-node routines
>>>>>                 1 MPI processes
>>>>>                   type: seqaij
>>>>>           linear system matrix = precond matrix:
>>>>>                   rows=0, cols=0
>>>>>                   package used to perform factorization: petsc
>>>>>                   total: nonzeros=1, allocated nonzeros=1
>>>>>                   total number of mallocs used during MatSetValues calls =0
>>>>>                     not using I-node routines
>>>>>           linear system matrix = precond matrix:
>>>>>           Matrix Object: 1 MPI processes
>>>>>             type: seqaij
>>>>>           Matrix Object:KSP Object: 1 MPI processes
>>>>>             type: seqaij
>>>>>             rows=0, cols=0
>>>>>             total: nonzeros=0, allocated nonzeros=0
>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>               not using I-node routines
>>>>>             rows=395, cols=395
>>>>>             total: nonzeros=32037, allocated nonzeros=32037
>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>               not using I-node routines
>>>>>         - - - - - - - - - - - - - - - - - -
>>>>>         KSP Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: preonly
>>>>>           maximum iterations=1, initial guess is zero
>>>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>           left preconditioning
>>>>>           using NONE norm type for convergence test
>>>>>         PC Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: lu
>>>>>             LU: out-of-place factorization
>>>>>             tolerance for zero pivot 2.22045e-14
>>>>>             using diagonal shift on blocks to prevent zero pivot
>>>>>             matrix ordering: nd
>>>>>             factor fill ratio given 5, needed 0
>>>>>               Factored matrix follows:
>>>>>                 Matrix Object: 1 MPI processes
>>>>>                   type: seqaij
>>>>>                   rows=0, cols=0
>>>>>                   package used to perform factorization: petsc
>>>>>                   total: nonzeros=1, allocated nonzeros=1
>>>>>                   total number of mallocs used during MatSetValues calls =0
>>>>>                     not using I-node routines
>>>>>           linear system matrix = precond matrix:
>>>>>           Matrix Object: 1 MPI processes
>>>>>             type: seqaij
>>>>>             rows=0, cols=0
>>>>>             total: nonzeros=0, allocated nonzeros=0
>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>               not using I-node routines
>>>>>         (mg_coarse_sub_) 1 MPI processes
>>>>>           type: preonly
>>>>>           maximum iterations=1, initial guess is zero
>>>>>           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>           left preconditioning
>>>>>           using NONE norm type for convergence test
>>>>>         PC Object: (mg_coarse_sub_) 1 MPI processes
>>>>>           type: lu
>>>>>             LU: out-of-place factorization
>>>>>             tolerance for zero pivot 2.22045e-14
>>>>>             using diagonal shift on blocks to prevent zero pivot
>>>>>             matrix ordering: nd
>>>>>             factor fill ratio given 5, needed 0
>>>>>               Factored matrix follows:
>>>>>                 Matrix Object: 1 MPI processes
>>>>>                   type: seqaij
>>>>>                   rows=0, cols=0
>>>>>                   package used to perform factorization: petsc
>>>>>                   total: nonzeros=1, allocated nonzeros=1
>>>>>                   total number of mallocs used during MatSetValues calls =0
>>>>>                     not using I-node routines
>>>>>           linear system matrix = precond matrix:
>>>>>           Matrix Object: 1 MPI processes
>>>>>             type: seqaij
>>>>>             rows=0, cols=0
>>>>>             total: nonzeros=0, allocated nonzeros=0
>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>               not using I-node routines
>>>>>       [1] number of local blocks = 1, first local block number = 1
>>>>>         [1] local block number 0
>>>>>         - - - - - - - - - - - - - - - - - -
>>>>>       [2] number of local blocks = 1, first local block number = 2
>>>>>         [2] local block number 0
>>>>>         - - - - - - - - - - - - - - - - - -
>>>>>       [3] number of local blocks = 1, first local block number = 3
>>>>>         [3] local block number 0
>>>>>         - - - - - - - - - - - - - - - - - -
>>>>>       linear system matrix = precond matrix:
>>>>>       Matrix Object: 4 MPI processes
>>>>>         type: mpiaij
>>>>>         rows=395, cols=395
>>>>>         total: nonzeros=32037, allocated nonzeros=32037
>>>>>         total number of mallocs used during MatSetValues calls =0
>>>>>           not using I-node (on process 0) routines
>>>>>   Down solver (pre-smoother) on level 1 -------------------------------
>>>>>     KSP Object: (mg_levels_1_) 4 MPI processes
>>>>>       type: chebyshev
>>>>>         Chebyshev: eigenvalue estimates:  min = 0.0636225, max = 1.33607
>>>>>       maximum iterations=2
>>>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>       left preconditioning
>>>>>       using nonzero initial guess
>>>>>       using NONE norm type for convergence test
>>>>>     PC Object: (mg_levels_1_) 4 MPI processes
>>>>>       type: jacobi
>>>>>       linear system matrix = precond matrix:
>>>>>       Matrix Object: 4 MPI processes
>>>>>         type: mpiaij
>>>>>         rows=23918, cols=23918
>>>>>         total: nonzeros=818732, allocated nonzeros=818732
>>>>>         total number of mallocs used during MatSetValues calls =0
>>>>>           not using I-node (on process 0) routines
>>>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>>>   Down solver (pre-smoother) on level 2 -------------------------------
>>>>>     KSP Object: (mg_levels_2_) 4 MPI processes
>>>>>       type: chebyshev
>>>>>         Chebyshev: eigenvalue estimates:  min = 0.0971369, max = 2.03987
>>>>>       maximum iterations=2
>>>>>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>>>>       left preconditioning
>>>>>       using nonzero initial guess
>>>>>       using NONE norm type for convergence test
>>>>>     PC Object: (mg_levels_2_) 4 MPI processes
>>>>>       type: jacobi
>>>>>       linear system matrix = precond matrix:
>>>>>       Matrix Object: 4 MPI processes
>>>>>         type: mpiaij
>>>>>         rows=262144, cols=262144
>>>>>         total: nonzeros=1835008, allocated nonzeros=1835008
>>>>>         total number of mallocs used during MatSetValues calls =0
>>>>>   Up solver (post-smoother) same as down solver (pre-smoother)
>>>>>   linear system matrix = precond matrix:
>>>>>   Matrix Object: 4 MPI processes
>>>>>     type: mpiaij
>>>>>     rows=262144, cols=262144
>>>>>     total: nonzeros=1835008, allocated nonzeros=1835008
>>>>>     total number of mallocs used during MatSetValues calls =0
>>>>>
>>>>> #PETSc Option Table entries:
>>>>> -ksp_view
>>>>> -options_left
>>>>> -pc_gamg_agg_nsmooths 1
>>>>> -pc_type gamg
>>>>> #End of PETSc Option Table entries
>>>>> There are no unused options.
>>>>>
>>>>> Thank you,
>>>>> Michele
>>>>>
>>
>
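For reference, the two configurations discussed in this thread amount to command lines along these lines (a sketch: ./test is the executable name from Michele's run above, -pc_mg_galerkin is the option Barry suggests at the top, and the geometric-multigrid line assumes the KSPSetDM() setup sketched earlier):

# algebraic multigrid (current setup): smoothed-aggregation GAMG
mpiexec -np 4 ./test -pc_type gamg -pc_gamg_agg_nsmooths 1 -ksp_view -options_left

# geometric multigrid: needs KSPSetDM(); -da_refine 4 gives 5 levels, and
# -pc_mg_galerkin forms the coarse matrices without per-level assembly routines
mpiexec -np 4 ./test -pc_type mg -da_refine 4 -pc_mg_galerkin -ksp_view -options_left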