> On Feb 5, 2020, at 9:03 AM, Дмитрий Мельничук <[email protected]> wrote:
>
> Barry, appreciate your response, as always.
>
> - You are saying that I am using ASM + ILU(0). However, I use PETSc only with "ASM" as the input parameter for the preconditioner. Does it mean that ILU(0) is the default sub-preconditioner for ASM?

Yes.
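For reference, the sub-solver that ASM ends up using is easy to confirm from the -ksp_view output already used in this thread; a run along the following lines prints it (the executable name ./my_solver is only a placeholder, the options are the ones discussed here):

    ./my_solver -ksp_type gmres -pc_type asm -sub_pc_type ilu -ksp_view

In that output the solver applied on each block is listed under the sub_ options prefix.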
> Can I change it using the option "-sub_pc_type"?

Yes. For example, -sub_pc_type sor will use SOR on each block instead of ILU, which saves a matrix.

> Does it make sense to you within the scope of my general goal, which is memory consumption decrease? Can it be useful to vary the "-sub_ksp_type" option?

Probably not.

> - I have run the computation for the same initial matrix with the "-sub_pc_factor_in_place" option, PC = ASM. Now the process consumed ~400 MB compared to 550 MB without this option.

This is what I expected, good.

> I used "-ksp_view" for this computation, two logs for this computation are attached:
> "ksp_view.txt" - ksp_view option only
> "full_log_ASM_factor_in_place.txt" - full log without the ksp_view option
>
> - Then I changed the primary preconditioner from ASM to ILU(0) and ran the computation: memory consumption was again about ~400 MB, no matter whether I use the "-sub_pc_factor_in_place" option.
>
> - Then I tried to run the computation with ILU(0) and "-pc_factor_in_place", just in case: the computation did not start, I got an error message, the log is attached: "Error_ilu_pc_factor.txt"

Since that matrix is used for the MatMult, you cannot do the factorization in place: it would replace the original matrix entries with the factorization entries.

> - Then I ran the computation with SOR as a preconditioner. PETSc gave me an error message, the log is attached: "Error_gmres_sor.txt"

This is because our SOR cannot handle zeros on the diagonal.

> - As for the kind of PDEs: I am solving the standard poroelasticity problem, the formulation can be found in the attached paper (Zheng_poroelasticity.pdf), pages 2-3.
> The file PDE.jpg is a snapshot of the PDEs from this paper.
>
> So, if you can give me any further advice on how to decrease the consumed amount of memory to approximately the matrix size (~200 MB in this case), it would be great. Do I need to focus on searching for a proper preconditioner? BTW, the single ILU(0) did not give me any memory advantage compared to ASM with "-sub_pc_factor_in_place".

Yes, because in both cases you need two copies of the matrix: one for the multiply and one for the ILU. But you want a preconditioner that doesn't require any new matrices. This is difficult.

You want an efficient preconditioner that requires essentially no additional memory? -ksp_type gmres or bcgs with -pc_type jacobi (SOR won't work because of the zero diagonals). It will not be a good preconditioner. Are you sure you don't have additional memory available for the preconditioner? A good preconditioner might require up to 5 to 6 times the memory of the original matrix.
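Putting that low-memory suggestion into a command line, it would look roughly like this (the executable name ./my_solver is a placeholder; the options are the ones named above, with -ksp_monitor and -ksp_converged_reason carried over from the run options already used in this thread):

    ./my_solver -ksp_type bcgs -pc_type jacobi -ksp_monitor -ksp_converged_reason

or the same with -ksp_type gmres. No extra matrix is created for the preconditioner, but iteration counts will likely be much higher than with ASM + ILU(0).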
> Have a pleasant day!
>
> Kind regards,
> Dmitry
>
>
> 04.02.2020, 19:04, "Smith, Barry F." <[email protected]>:
>
> Please run with the option -ksp_view so we know the exact solver options you are using.
>
> From the lines
>
> MatCreateSubMats 1 1.0 1.9397e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering   1 1.0 1.1066e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatIncreaseOvrlp 1 1.0 3.0324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 0
>
> and the fact that you have three matrices, I would guess you are using the additive Schwarz preconditioner (ASM) with ILU(0) on the blocks (which converges the same as ILU on one process but does use much more memory).
>
> Note: your code is still built with 32-bit integers.
>
> I would guess the basic matrix formed plus the vectors in this example could take ~200 MB. It is the two matrices in the additive Schwarz that are taking the additional memory.
>
> What kind of PDEs are you solving and what kind of formulation?
>
> ASM plus ILU is the "workman's" type of preconditioner, relatively robust but not particularly fast to converge. Depending on your problem you might be able to do much better convergence-wise by using a PCFIELDSPLIT and a PCGAMG on one of the splits. In your own run you can see the ILU is chugging along rather slowly to the solution.
>
> With your current solvers you can use the option -sub_pc_factor_in_place, which will shave off the memory of one of the matrices. Please try that.
>
> By avoiding the ASM you can avoid both extra matrices, but at the cost of even slower convergence. Use, for example, -pc_type sor
>
> The petroleum industry also has a variety of "custom" preconditioners/solvers for particular models and formulations that can beat the convergence of general-purpose solvers and require less memory. Some of these can be implemented or simulated with PETSc. Some of them are implemented in the commercial petroleum simulation codes, and it can be difficult to get a handle on exactly what they do because of proprietary issues. I think I have an old text on these approaches in my office; there may be modern books that discuss them.
>
> Barry
>
>
> On Feb 4, 2020, at 6:04 AM, Дмитрий Мельничук <[email protected]> wrote:
>
> Hello again!
> Thank you very much for your replies!
> The log is attached.
>
> 1. The main problem now is the following. To solve the matrix that is attached to my previous e-mail, PETSc consumes ~550 MB.
> I know for certain that there are commercial software packages in the petroleum industry (e.g., Schlumberger Petrel) that solve the same initial problem consuming only ~200 MB.
> Moreover, I am sure that when I used 32-bit PETSc (GMRES + ASM) a year ago, it also consumed ~200 MB for this matrix.
>
> So, my question is: do you have any advice on how to decrease the amount of RAM consumed for such a matrix from 550 MB to 200 MB? Maybe some specific preconditioner or other ways?
>
> I will be very grateful for any thoughts!
>
> 2. The second problem is more particular.
> According to the resource manager in Windows 10, the Fortran solver based on PETSc consumes 548 MB of RAM while solving the system of linear equations.
> As I understand from the logs, 459 MB and 52 MB are required for matrix and vector storage, respectively. After summing all objects for which memory is allocated we get only 517 MB.
>
> Thank you again for your time! Have a nice day.
>
> Kind regards,
> Dmitry
>
>
> 03.02.2020, 19:55, "Smith, Barry F." <[email protected]>:
>
> GMRES can also, by default, require about 35 work vectors if it reaches the full restart. You can set a smaller restart with -ksp_gmres_restart 15, for example, but this can also hurt the convergence of GMRES dramatically. People sometimes use the KSPBCGS algorithm since it does not require all the restart vectors, but it can also converge more slowly.
>
> Depending on how much memory the sparse matrices use relative to the vectors, the vector memory may or may not matter.
>
> If you are using a recent version of PETSc you can run with -log_view -log_view_memory and it will show, on the right side of the columns, how much memory is being allocated for each of the operations in various ways.
>
> Barry
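As a concrete illustration, those flags can simply be appended to the options string that appears in the Fortran driver quoted further down in this thread; the restart value 15 is just the example number from the paragraph above:

    options = "-pc_asm_overlap 2 -pc_asm_type basic -ksp_monitor -ksp_converged_reason -ksp_gmres_restart 15 -log_view -log_view_memory"

Note that -log_view_memory only adds the per-event memory information when -log_view itself is also given.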
>
> On Feb 3, 2020, at 10:34 AM, Matthew Knepley <[email protected]> wrote:
>
> On Mon, Feb 3, 2020 at 10:38 AM Дмитрий Мельничук <[email protected]> wrote:
> Hello all!
>
> Now I am faced with a problem associated with memory allocation when calling KSPSolve.
>
> GMRES preconditioned by ASM was chosen for solving the linear algebraic system (obtained by the finite element spatial discretisation of the Biot poroelasticity model).
> According to the output value of the PetscMallocGetCurrentUsage subroutine, 176 MB is required for matrix and RHS vector storage (before the KSPSolve call). But during the solution of the linear system 543 MB of RAM is required (during the KSPSolve call).
> Thus, the amount of allocated memory after the preconditioning stage increased three times. This kind of behaviour is critical for 3D models with several million cells.
>
> 1) In order to know anything, we have to see the output of -ksp_view, although I see you used an overlap of 2
>
> 2) The overlap increases the size of the submatrices beyond that of the original matrix. Suppose that you used LU for the sub-preconditioner. You would need at least 2x memory (with ILU(0)) since the matrix dominates memory usage. Moreover, you have overlap and you might have fill-in depending on the solver.
>
> 3) The massif tool from valgrind is a good fine-grained way to look at memory allocation
>
> Thanks,
>
> Matt
>
> Is there a way to decrease the amount of allocated memory?
> Is that expected behaviour for the GMRES-ASM combination?
>
> As I remember, the previous version of PETSc did not demonstrate such a significant memory increase.
>
> ...
> Vec :: Vec_F, Vec_U
> Mat :: Mat_K
> ...
> ...
> call MatAssemblyBegin(Mat_M,Mat_Final_Assembly,ierr)
> call MatAssemblyEnd(Mat_M,Mat_Final_Assembly,ierr)
> ....
> call VecAssemblyBegin(Vec_F_mod,ierr)
> call VecAssemblyEnd(Vec_F_mod,ierr)
> ...
> ...
> call PetscMallocGetCurrentUsage(mem, ierr)
> print *,"Memory used: ",mem
> ...
> ...
> call KSPSetType(Krylov,KSPGMRES,ierr)
> call KSPGetPC(Krylov,PreCon,ierr)
> call PCSetType(PreCon,PCASM,ierr)
> call KSPSetFromOptions(Krylov,ierr)
> ...
> call KSPSolve(Krylov,Vec_F,Vec_U,ierr)
> ...
> ...
> options = "-pc_asm_overlap 2 -pc_asm_type basic -ksp_monitor -ksp_converged_reason"
>
>
> Kind regards,
> Dmitry Melnichuk
> Matrix.dat (265288024)
>
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>
>
> <Logs_26K_GMRES-ASM-log_view-log_view_memory-malloc_dump_32bit>
>
> <ksp_view.txt><PDE.JPG><Zheng_poroelasticity.pdf><full_log_ASM_factor_in_place.txt><Error_gmres_sor.txt><Error_ilu_pc_factor.txt>
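Matt's valgrind suggestion from the message above would look something like the following in practice (the executable name ./my_solver is a placeholder and <pid> stands for the process id that massif appends to its output file; the solver options are the ones used elsewhere in this thread):

    valgrind --tool=massif ./my_solver -ksp_type gmres -pc_type asm -sub_pc_type ilu -sub_pc_factor_in_place
    ms_print massif.out.<pid>

ms_print then gives a snapshot-by-snapshot breakdown of where the heap memory is being allocated.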
