Hi Matt,

Sorry for the unclear explanation. My layout is like this:

Proc 0: Rows 0-499 and rows 1000-1499
Proc 1: Rows 500-999 and rows 1500-1999

I have two unknowns, rho and phi, each of which corresponds to a contiguous chunk of rows.

Phi: Rows 0-999
Rho: Rows 1000-1999

My source data (an OpenFOAM matrix) has the unknowns row-contiguous, which is 
why my layout is like this. My understanding is that my IS are set up correctly 
to match this matrix structure, so I am not sure why I am getting the error 
message. I attached the output of my IS in my previous message.
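
For concreteness, here is a minimal sketch (not the actual application code) of 
how that layout translates into index sets on each rank, assuming ISCreateStride 
and PCFieldSplitSetIS are used and that pc is the PCFIELDSPLIT preconditioner 
obtained from the KSP:

  /* Sketch only, assuming petscksp.h is included and pc already exists.
     rank 0: phi rows 0-499,   rho rows 1000-1499
     rank 1: phi rows 500-999, rho rows 1500-1999                        */
  PetscMPIInt    rank;
  IS             is_phi, is_rho;
  PetscErrorCode ierr;

  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  ierr = ISCreateStride(PETSC_COMM_WORLD, 500, rank*500,        1, &is_phi);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD, 500, 1000 + rank*500, 1, &is_rho);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "phi", is_phi);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "rho", is_rho);CHKERRQ(ierr);

Whether the rho rows given by each rank actually fall inside that rank's 
ownership range of the Mat is the point Matt raises below.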

Thank you,
Joshua
________________________________
From: Matthew Knepley <[email protected]>
Sent: Monday, March 20, 2023 6:16 PM
To: Christopher, Joshua <[email protected]>
Cc: Barry Smith <[email protected]>; [email protected] 
<[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG

On Mon, Mar 20, 2023 at 6:45 PM Christopher, Joshua via petsc-users 
<[email protected]> wrote:
Hi Barry and Mark,

Thank you for your responses. I implemented the index sets in my application 
and it appears to work in serial. Unfortunately I am having some trouble 
running in parallel. The error I am getting is:
[1]PETSC ERROR: Petsc has generated inconsistent data
[1]PETSC ERROR: Number of entries found in complement 1000 does not match 
expected 500
[1]PETSC ERROR: #1 ISComplement() at 
petsc-3.16.5/src/vec/is/is/utils/iscoloring.c:837
[1]PETSC ERROR: #2 PCSetUp_FieldSplit() at 
petsc-3.16.5/src/ksp/pc/impls/fieldsplit/fieldsplit.c:882
[1]PETSC ERROR: #3 PCSetUp() at petsc-3.16.5/src/ksp/pc/interface/precon.c:1017
[1]PETSC ERROR: #4 KSPSetUp() at petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:408
[1]PETSC ERROR: #5 KSPSolve_Private() at 
petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:852
[1]PETSC ERROR: #6 KSPSolve() at 
petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:1086
[1]PETSC ERROR: #7 solvePetsc() at coupled/coupledSolver.C:612

I am testing with two processors and a 2000x2000 matrix. I have two fields, phi 
and rho. The matrix has rows 0-999 for phi and rows 1000-1999 for rho. Proc 0 
has rows 0-499 and 1000-1499 while proc 1 has rows 500-999 and 1500-1999. I've 
attached the ASCII printout of the IS for phi and rho. Am I right in thinking 
that I have some issue with my IS layouts?

I do not understand your explanation. Your matrix is 2000x2000, and I assume 
split so that

  proc 0 has rows 0       --   999
  proc 1 has rows 1000 -- 1999

Now, when you call PCFieldSplitSetIS(), each process gives an IS which 
indicates the dofs _owned by that process_ that contribute to field k. If you
do not give unknowns within the global row bounds for that process, the 
ISComplement() call will not work.
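
A quick sanity check along these lines can be run on each rank (a sketch; A and 
is_phi stand for your assembled Mat and one of your splits):

  /* Sketch: flag any IS entry that this rank does not own in A. */
  PetscInt        rstart, rend, n, i;
  const PetscInt *idx;
  PetscErrorCode  ierr;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  ierr = ISGetLocalSize(is_phi, &n);CHKERRQ(ierr);
  ierr = ISGetIndices(is_phi, &idx);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {
    if (idx[i] < rstart || idx[i] >= rend) {
      ierr = PetscPrintf(PETSC_COMM_SELF, "row %D is outside this rank's range [%D,%D)\n", idx[i], rstart, rend);CHKERRQ(ierr);
    }
  }
  ierr = ISRestoreIndices(is_phi, &idx);CHKERRQ(ierr);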

Of course, we should check that the entries are not out of bounds when they are 
submitted. If you want to do it, it would be a cool submission.

   Thanks,

      Matt

Thank you,
Joshua


________________________________
From: Barry Smith <[email protected]>
Sent: Friday, March 17, 2023 1:22 PM
To: Christopher, Joshua <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG



On Mar 17, 2023, at 1:26 PM, Christopher, Joshua 
<[email protected]> wrote:

Hi Barry,

Thank you for your response. I'm a little confused about the relation between 
the IS integer values and matrix indices. 
From https://petsc.org/release/src/snes/tutorials/ex70.c.html it looks like my 
IS should just contain a list of the rows for each split?
have a 100x100 matrix with two fields, "rho" and "phi", the first 50 rows 
correspond to the "rho" variable and the last 50 correspond to the "phi" 
variable. So I should call PCFieldSplitSetIS twice, the first with an IS 
containing integers 0-49 and the second with integers 50-99? PCFieldSplitSetIS 
is expecting global row numbers, correct?

  As Mark said, yes this sounds fine.

My matrix is organized as one block after another.

   When you are running in parallel with MPI, how will you organize the 
unknowns? Will you have 25 of the rho followed by 25 of phi on each MPI 
process? You will need to take this into account when you build the IS on each 
MPI process.
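
For example, each rank could classify just the rows it owns (a sketch only; A 
is the assembled parallel matrix, and 50 is the rho/phi boundary from the 
100x100 example above):

  /* Sketch: build per-rank splits from the rows this rank owns,
     using the 100x100 example (rows 0-49 = rho, rows 50-99 = phi). */
  PetscInt        rstart, rend, row, nrho = 0, nphi = 0, *irho, *iphi;
  IS              is_rho, is_phi;
  PetscErrorCode  ierr;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &irho);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &iphi);CHKERRQ(ierr);
  for (row = rstart; row < rend; row++) {
    if (row < 50) irho[nrho++] = row;  /* rho block */
    else          iphi[nphi++] = row;  /* phi block */
  }
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nrho, irho, PETSC_COPY_VALUES, &is_rho);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, nphi, iphi, PETSC_COPY_VALUES, &is_phi);CHKERRQ(ierr);
  ierr = PetscFree(irho);CHKERRQ(ierr);
  ierr = PetscFree(iphi);CHKERRQ(ierr);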

  Barry



Thank you,
Joshua
________________________________
From: Barry Smith <[email protected]>
Sent: Tuesday, March 14, 2023 1:35 PM
To: Christopher, Joshua <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG


  You definitely do not need to use a complicated DM to take advantage of 
PCFIELDSPLIT. All you need to do is create two IS on each MPI process. The 
first should list all the indices of the degrees of freedom of your first type 
of variable and the second should list all the rest of the degrees of freedom. 
Then use https://petsc.org/release/docs/manualpages/PC/PCFieldSplitSetIS/

  Barry

Note: PCFIELDSPLIT does not care how you have ordered your degrees of freedom 
of the two types. You might interlace them, or put all of the first type of 
degree of freedom on an MPI process and then all of the second type. This just 
determines what your IS look like.
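
In code, attaching the splits amounts to something like the following sketch 
(ksp, is_first, and is_second are assumed to exist already; the split names are 
arbitrary):

  /* Sketch: attach the two index sets to a PCFIELDSPLIT preconditioner. */
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "0", is_first);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "1", is_second);CHKERRQ(ierr);
  /* The split combination (additive, multiplicative, Schur) and the inner
     solvers can then be selected at run time with -pc_fieldsplit_* options. */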



On Mar 14, 2023, at 1:14 PM, Christopher, Joshua via petsc-users 
<[email protected]> wrote:

Hello PETSc users,

I haven't heard back from the library developer regarding the numbering issue 
or my questions on using field split operators with their library, so I need to 
fix this myself.

Regarding the natural numbering vs parallel numbering: I haven't figured out 
what is wrong here. I stepped through in parallel and it looks like each 
processor is setting up the matrix and calling MatSetValue similar to what is 
shown in https://petsc.org/release/src/ksp/ksp/tutorials/ex2.c.html. I see that 
PETSc is recognizing my simple two-processor test from the output 
("PetscInitialize_Common(): PETSc successfully started: number of processors = 
2"). I'll keep poking at this, however I'm very new to PETSc. When I print the 
matrix to ASCII using PETSC_VIEWER_DEFAULT, I'm guessing I see one row per 
line, and the tuples consists of the column number and value?

On the FieldSplit preconditioner, is my understanding here correct:

To use FieldSplit, I must have a DM. Since I have an unstructured mesh, I must 
use DMPlex and set up the chart and covering relations specific to my mesh 
as described here: https://petsc.org/release/docs/manual/dmplex/. I think this may 
be very time-consuming for me to set up.

Currently, I already have a matrix stored in a parallel sparse L-D-U format. I 
am converting into PETSc's sparse parallel AIJ matrix (traversing my matrix and 
using MatSetValues). The weights for my discretization scheme are already 
accounted for in the coefficients of my L-D-U matrix. I do have the submatrices 
in L-D-U format for each of my two equations' coupling with each other. That 
is, the equivalent of lines 242,251-252,254 of example 28 
https://petsc.org/release/src/snes/tutorials/ex28.c.html. Could I directly 
convert my submatrices into PETSc submatrices here, then assemble them 
together so that the field split preconditioners will work?

Alternatively, since my L-D-U matrices already account for the discretization 
scheme, can I use a simple structured grid DM?
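
Regarding the submatrix question above, one possible route (a sketch only; App, 
Apr, Arp, Arr stand for the already-assembled phi-phi, phi-rho, rho-phi, and 
rho-rho coupling blocks, and pc is the fieldsplit preconditioner) is to combine 
the blocks with MATNEST and reuse its index sets for the splits:

  /* Sketch: build a 2x2 nested operator from the coupling blocks. */
  Mat            subs[4], A;
  IS             rows[2];
  PetscErrorCode ierr;

  subs[0] = App; subs[1] = Apr;
  subs[2] = Arp; subs[3] = Arr;
  ierr = MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, subs, &A);CHKERRQ(ierr);
  /* The row index sets created by the nest can be passed to PCFieldSplitSetIS. */
  ierr = MatNestGetISs(A, rows, NULL);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "phi", rows[0]);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "rho", rows[1]);CHKERRQ(ierr);

A MATNEST matrix supports only a subset of operations, so if a preconditioner 
needs a plain AIJ matrix the nest can be converted with MatConvert.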

Thank you so much for your help!
Regards,
Joshua
________________________________
From: Pierre Jolivet <[email protected]>
Sent: Friday, March 3, 2023 11:45 AM
To: Christopher, Joshua <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG

For full disclosure, with -ksp_pc_side right -ksp_max_it 100 -ksp_rtol 1E-10:
1) with renumbering via ParMETIS
-pc_type bjacobi -sub_pc_type lu -sub_pc_factor_mat_solver_type mumps => Linear 
solve converged due to CONVERGED_RTOL iterations 10
-pc_type hypre -pc_hypre_boomeramg_relax_type_down l1-Gauss-Seidel 
-pc_hypre_boomeramg_relax_type_up backward-l1-Gauss-Seidel => Linear solve 
converged due to CONVERGED_RTOL iterations 55
2) without renumbering via ParMETIS
-pc_type bjacobi => Linear solve did not converge due to DIVERGED_ITS 
iterations 100
-pc_type hypre => Linear solve did not converge due to DIVERGED_ITS iterations 
100
Using an outer fieldsplit may help fix this.
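
For example, something along these lines could be a starting point (a sketch; 
it assumes the two splits have been registered with PCFieldSplitSetIS, and 0/1 
stand for whatever split names were used):

-pc_type fieldsplit
-pc_fieldsplit_type multiplicative
-fieldsplit_0_ksp_type preonly
-fieldsplit_0_pc_type hypre
-fieldsplit_1_ksp_type preonly
-fieldsplit_1_pc_type hypre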

Thanks,
Pierre

On 3 Mar 2023, at 6:24 PM, Christopher, Joshua via petsc-users 
<[email protected]> wrote:

I am solving these equations in the context of electrically-driven fluid flows 
as that first paper describes. I am using a PIMPLE scheme to advance the fluid 
equations in time, and my goal is to do a coupled solve of the electric 
equations similar to what is described in this paper: 
https://www.sciencedirect.com/science/article/pii/S0045793019302427. They are 
using the SIMPLE scheme in this paper. My fluid flow should eventually reach 
steady behavior, and likewise the time derivative in the charge density should 
trend towards zero. They preferred using BiCGStab with a direct LU 
preconditioner for solving their electric equations. I tried to test that 
combination, but my case is halting for unknown reasons in the middle of the 
PETSc solve. I'll try with more nodes and see if I am running out of memory, 
but the computer is a little overloaded at the moment so it may take a while to 
run.

I sent Pierre Jolivet my matrix and RHS, and they said the matrix does not 
appear to be following a parallel numbering, and instead looks like the matrix 
has natural numbering. When they renumbered the system with ParMETIS they got 
really fast convergence. I am using PETSc through a library, so I will reach 
out to the library authors and see if there is an issue in the library.

Thank you,
Joshua
________________________________
From: Barry Smith <[email protected]>
Sent: Thursday, March 2, 2023 3:47 PM
To: Christopher, Joshua <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG




<Untitled.png>

  Are you solving this as a time-dependent problem? Using an implicit scheme 
(like backward Euler) for rho? In ODE language, solving the differential 
algebraic equation?

Is epsilon bounded away from 0?

On Mar 2, 2023, at 4:22 PM, Christopher, Joshua 
<[email protected]> wrote:

Hi Barry and Mark,

Thank you for looking into my problem. The two equations I am solving with 
PETSc are equations 6 and 7 from this 
paper: https://ris.utwente.nl/ws/portalfiles/portal/5676495/Roghair+Paper_final_draft_v1.pdf

I just used MUMPS and SuperLU_DIST on my full-size problem (with 3,000,000 
unknowns). To clarify, I did a direct solve with -ksp_type preonly. They take a 
very long time, about 30 minutes for MUMPS and 18 minutes for SuperLU_DIST, see 
attached output. For reference, the same matrix took 658 iterations of 
BoomerAMG and about 20 seconds of walltime. Maybe I am already getting a great 
deal with BoomerAMG!
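
For completeness, a preonly direct solve of this kind is typically driven with 
options along these lines (a sketch; the exact flags used for these runs may 
have differed):

-ksp_type preonly
-pc_type lu
-pc_factor_mat_solver_type mumps   (or superlu_dist)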

I'll try removing some terms from my solve (e.g. removing the second equation, 
then making the second equation just the elliptic portion of the equation, 
etc.) and try with a simpler geometry. I'll keep you updated as I run into 
trouble with that route. I wasn't aware of Field Split preconditioners; I'll 
do some reading on them and give them a try as well.

Thank you again,
Joshua
________________________________

From: Barry Smith <[email protected]>
Sent: Thursday, March 2, 2023 7:47 AM
To: Christopher, Joshua <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre 
BoomerAMG


  Have you tried MUMPS (or SuperLU_DIST) on the full-size problem with the 
5,000,000 unknowns? It is at the high end of problem sizes you can do with 
direct solvers but is worth comparing with BoomerAMG. You likely want to use 
more nodes and fewer cores per node with MUMPS to be able to access more 
memory. If you need to solve multiple right-hand sides with the same matrix, 
the factors will be reused, making the second and later solves much faster.

  I agree with Mark: with iterative solvers you are likely to end up with 
PCFIELDSPLIT.

  Barry


On Mar 1, 2023, at 7:17 PM, Christopher, Joshua via petsc-users 
<[email protected]> wrote:

Hello,

I am trying to solve the leaky-dielectric model equations with PETSc using a 
second-order discretization scheme (with limiting to first order as needed) 
using the finite volume method. The leaky dielectric model is a coupled system 
of two equations, consisting of a Poisson equation and a convection-diffusion 
equation.  I have tested on small problems with simple geometry (~1000 DoFs) 
using:

-ksp_type gmres
-pc_type hypre
-pc_hypre_type boomeramg

and I get RTOL convergence to 1.e-5 in about 4 iterations. I tested this in 
parallel with 2 cores, and was previously also able to successfully use a 
direct solver in serial on this problem. When I scale up to my production 
problem, I get significantly worse convergence. My production problem has ~3 
million DoFs, more complex geometry, and is solved on ~100 cores across two 
nodes. The boundary conditions change a little because of the geometry, but are 
of the same classifications (e.g. only Dirichlet and Neumann). On the 
production case, I need 600-4000 iterations to converge. I've attached 
the output from the first solve that took 658 iterations to converge, using the 
following output options:

-ksp_view_pre
-ksp_view
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_test_null_space

My matrix is non-symmetric, the condition number can be around 10e6, and the 
eigenvalues reported by PETSc have been real and positive (using 
-ksp_view_eigenvalues).

I have tried using other preconditioners (superlu, mumps, gamg, mg) but 
hypre+boomeramg has performed the best so far. The literature seems to indicate 
that AMG is the best approach for solving these equations in a coupled fashion.

Do you have any advice on speeding up the convergence of this system?

Thank you,
Joshua
<petsc_gmres_boomeramg.txt>

<petsc_preonly_mumps.txt><petsc_preonly_superlu.txt>



--
What most experimenters take for granted before they begin their experiments is 
infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
