I am sorry, I cannot reproduce what you describe. I am using
src/ksp/ksp/tutorials/ex45.c in the main branch (should be same as release for
this purpose).
With no change to the code, I get
$ ./ex45 -ksp_converged_reason -ksp_type richardson -ksp_rtol 1e-09 -pc_type mg -pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 6 -mg_levels_ksp_converged_maxits -mg_levels_pc_type jacobi -mg_coarse_ksp_type richardson -mg_coarse_ksp_max_it 6 -mg_coarse_ksp_converged_maxits -mg_coarse_pc_type jacobi -ksp_monitor_true_residual -ksp_view
0 KSP preconditioned resid norm 1.851257578045e+01 true resid norm 1.476491378857e+01 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 3.720545622095e-01 true resid norm 5.171053311198e-02 ||r(i)||/||b|| 3.502257707188e-03
2 KSP preconditioned resid norm 1.339047557616e-02 true resid norm 1.866765310863e-03 ||r(i)||/||b|| 1.264325235890e-04
3 KSP preconditioned resid norm 4.833887599029e-04 true resid norm 6.867629264754e-05 ||r(i)||/||b|| 4.651316873974e-06
4 KSP preconditioned resid norm 1.748167886388e-05 true resid norm 3.398334857479e-06 ||r(i)||/||b|| 2.301628648933e-07
5 KSP preconditioned resid norm 6.570567424652e-07 true resid norm 4.304483984231e-07 ||r(i)||/||b|| 2.915346507180e-08
6 KSP preconditioned resid norm 4.013427896557e-08 true resid norm 7.502068698790e-08 ||r(i)||/||b|| 5.081010838410e-09
7 KSP preconditioned resid norm 5.934811016347e-09 true resid norm 1.333884145638e-08 ||r(i)||/||b|| 9.034147877457e-10
Linear solve converged due to CONVERGED_RTOL iterations 7
KSP Object: 1 MPI process
type: richardson
damping factor=1.
maximum iterations=10000, nonzero initial guess
tolerances: relative=1e-09, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
type: mg
type is MULTIPLICATIVE, levels=3 cycles=v
Cycles per PCApply=1
Not using Galerkin computed coarse grid matrices
Coarse grid solver -- level 0 -------------------------------
KSP Object: (mg_coarse_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_coarse_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=8, cols=8
total: nonzeros=32, allocated nonzeros=32
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=64, cols=64
total: nonzeros=352, allocated nonzeros=352
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=343, cols=343
total: nonzeros=2107, allocated nonzeros=2107
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=343, cols=343
total: nonzeros=2107, allocated nonzeros=2107
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Residual norm 1.33388e-08
~/Src/petsc/src/ksp/ksp/tutorials (main=) arch-main
$
Now I change the code to

  if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
    barray[k][j][i] = 0; //2.0 * (HxHydHz + HxHzdHy + HyHzdHx);
  } else {
    barray[k][j][i] = 1; //Hx * Hy * Hz;
  }
I do not understand where I am supposed to change the dimension to 33, so I
ignore that statement (though see the sketch after the second run below for
where the grid size could be set). The same command line with the change above gives
$ ./ex45 -ksp_converged_reason -ksp_type richardson -ksp_rtol 1e-09 -pc_type mg -pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 6 -mg_levels_ksp_converged_maxits -mg_levels_pc_type jacobi -mg_coarse_ksp_type richardson -mg_coarse_ksp_max_it 6 -mg_coarse_ksp_converged_maxits -mg_coarse_pc_type jacobi -ksp_monitor_true_residual -ksp_view
0 KSP preconditioned resid norm 7.292257119299e+01 true resid norm 1.118033988750e+01 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 2.534913491362e+00 true resid norm 3.528425353826e-01 ||r(i)||/||b|| 3.155919577875e-02
2 KSP preconditioned resid norm 9.145057509152e-02 true resid norm 1.279725352471e-02 ||r(i)||/||b|| 1.144621152262e-03
3 KSP preconditioned resid norm 3.302446009474e-03 true resid norm 5.122622088691e-04 ||r(i)||/||b|| 4.581812485342e-05
4 KSP preconditioned resid norm 1.204504429329e-04 true resid norm 4.370692051248e-05 ||r(i)||/||b|| 3.909265814124e-06
5 KSP preconditioned resid norm 5.339971695523e-06 true resid norm 7.229991776815e-06 ||r(i)||/||b|| 6.466701235889e-07
6 KSP preconditioned resid norm 5.856425044706e-07 true resid norm 1.282860114273e-06 ||r(i)||/||b|| 1.147424968455e-07
7 KSP preconditioned resid norm 1.007137752126e-07 true resid norm 2.283009757390e-07 ||r(i)||/||b|| 2.041986004328e-08
8 KSP preconditioned resid norm 1.790021892548e-08 true resid norm 4.063263596129e-08 ||r(i)||/||b|| 3.634293444578e-09
Linear solve converged due to CONVERGED_RTOL iterations 8
KSP Object: 1 MPI process
type: richardson
damping factor=1.
maximum iterations=10000, nonzero initial guess
tolerances: relative=1e-09, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI process
type: mg
type is MULTIPLICATIVE, levels=3 cycles=v
Cycles per PCApply=1
Not using Galerkin computed coarse grid matrices
Coarse grid solver -- level 0 -------------------------------
KSP Object: (mg_coarse_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_coarse_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=8, cols=8
total: nonzeros=32, allocated nonzeros=32
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=64, cols=64
total: nonzeros=352, allocated nonzeros=352
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1 MPI process
type: richardson
damping factor=1.
maximum iterations=6, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 1 MPI process
type: jacobi
type DIAGONAL
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=343, cols=343
total: nonzeros=2107, allocated nonzeros=2107
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: 1 MPI process
type: seqaij
rows=343, cols=343
total: nonzeros=2107, allocated nonzeros=2107
total number of mallocs used during MatSetValues calls=0
not using I-node routines
Residual norm 4.06326e-08
~/Src/petsc/src/ksp/ksp/tutorials (main *=) arch-main
$
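Coming back to the dimension question: if what is meant is the global grid size, then, assuming the stock ex45.c sets it in a DMDACreate3d() call in main() roughly like the sketch below (written from memory, so the argument list may not match the source exactly), that is where 7, 7, 7 would become 33, 33, 33:

  PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_STAR, 33, 33, 33, /* global grid size; 7, 7, 7 in the stock code */
                         PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE, 1, 1, 0, 0, 0, &da));

Alternatively, since ex45.c calls DMSetFromOptions() on the DMDA, the grid size can be changed without touching the code, e.g. with -da_grid_x 33 -da_grid_y 33 -da_grid_z 33, or by refining the default grid with -da_refine.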
In neither case is it taking 25 iterations. What am I doing wrong?
Normally one expects only trivial changes in the convergence of multigrid
methods when one changes the values in the right-hand side, as with the runs above.
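
Regarding the question below about what the diagonal entries at the boundary should be changed to "according to FE": in ex45.c the Dirichlet rows already get the same scale as the interior diagonal. Here is a sketch of a ComputeMatrix()-style routine; it is written from memory rather than copied from ex45.c, so names and details may differ, but it shows the scaling I mean: the Dirichlet diagonal is the same 2.0 * (HxHydHz + HxHzdHy + HyHzdHx) value that sits on the interior diagonal, not an unscaled 1.0.

  /* ex45.c already includes the needed headers; listed here so the sketch is self-contained */
  #include <petscdmda.h>
  #include <petscksp.h>

  static PetscErrorCode ComputeMatrixSketch(KSP ksp, Mat J, Mat B, void *ctx)
  {
    DM            da;
    DMDALocalInfo info;
    PetscInt      i, j, k, mx, my, mz;
    PetscScalar   Hx, Hy, Hz, HxHydHz, HxHzdHy, HyHzdHx, v[7];
    MatStencil    row, col[7];

    PetscFunctionBeginUser;
    PetscCall(KSPGetDM(ksp, &da));
    PetscCall(DMDAGetLocalInfo(da, &info));
    mx = info.mx; my = info.my; mz = info.mz;
    Hx = 1.0 / (PetscReal)(mx - 1);
    Hy = 1.0 / (PetscReal)(my - 1);
    Hz = 1.0 / (PetscReal)(mz - 1);
    HxHydHz = Hx * Hy / Hz; HxHzdHy = Hx * Hz / Hy; HyHzdHx = Hy * Hz / Hx;
    for (k = info.zs; k < info.zs + info.zm; k++) {
      for (j = info.ys; j < info.ys + info.ym; j++) {
        for (i = info.xs; i < info.xs + info.xm; i++) {
          row.i = i; row.j = j; row.k = k;
          if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
            /* Dirichlet row: diagonal gets the same value as an interior diagonal entry,
               not an unscaled 1.0 */
            v[0] = 2.0 * (HxHydHz + HxHzdHy + HyHzdHx);
            PetscCall(MatSetValuesStencil(B, 1, &row, 1, &row, v, INSERT_VALUES));
          } else {
            /* standard 7-point finite difference stencil scaled by the cell volume */
            v[0] = -HxHydHz; col[0].i = i; col[0].j = j; col[0].k = k - 1;
            v[1] = -HxHzdHy; col[1].i = i; col[1].j = j - 1; col[1].k = k;
            v[2] = -HyHzdHx; col[2].i = i - 1; col[2].j = j; col[2].k = k;
            v[3] = 2.0 * (HxHydHz + HxHzdHy + HyHzdHx); col[3].i = i; col[3].j = j; col[3].k = k;
            v[4] = -HyHzdHx; col[4].i = i + 1; col[4].j = j; col[4].k = k;
            v[5] = -HxHzdHy; col[5].i = i; col[5].j = j + 1; col[5].k = k;
            v[6] = -HxHydHz; col[6].i = i; col[6].j = j; col[6].k = k + 1;
            PetscCall(MatSetValuesStencil(B, 1, &row, 7, col, v, INSERT_VALUES));
          }
        }
      }
    }
    PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));
    PetscFunctionReturn(0);
  }

If your code puts 1.0 on the Dirichlet diagonal while the interior diagonal carries this cell-volume scaling, that is exactly the kind of mismatch mentioned below that can hurt the convergence.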
Barry
> On Feb 27, 2023, at 7:16 PM, Paul Grosse-Bley
> <[email protected]> wrote:
>
> The scaling might be the problem, especially since I don't know what you mean
> by scaling it according to FE.
>
> For reproducing the issue with a smaller problem:
> Change the ComputeRHS function in ex45.c
>
> if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
> barray[k][j][i] = 0.0;
> } else {
> barray[k][j][i] = 1.0;
> }
>
> Change the dimensions to e.g. 33 (I scaled it down, so it goes quickly without
> a GPU) instead of 7 and then run with
>
> -ksp_converged_reason -ksp_type richardson -ksp_rtol 1e-09 -pc_type mg -pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 6 -mg_levels_ksp_converged_maxits -mg_levels_pc_type jacobi -mg_coarse_ksp_type richardson -mg_coarse_ksp_max_it 6 -mg_coarse_ksp_converged_maxits -mg_coarse_pc_type jacobi
>
> You will find that it takes 145 iterations instead of 25 for the original
> ex45 RHS. My hpgmg-cuda implementation (using 32^3) takes 41 iterations.
>
> What do I have to change the diagonal entries of the matrix at the boundary
> to, according to FE? Right now the diagonal is completely constant.
>
> Paul
>
> On Tuesday, February 28, 2023 00:23 CET, Barry Smith <[email protected]> wrote:
>
>>
>>
>> I have not seen explicitly including, or excluding, the Dirichlet boundary
>> values in the system having a significant effect on the convergence, so long
>> as you SCALE the diagonal rows (of those Dirichlet points) by a value
>> similar to the other entries along the diagonal. If they are scaled
>> completely differently, that can screw up the convergence. For
>> src/ksp/ksp/ex45.c I see that the appropriate scaling is used (note the
>> scaling should come from a finite element view of the discretization even if
>> the discretization is finite differences, as is done in ex45.c).
>>
>> Are you willing to share the two codes so we can take a look with
>> experienced eyes to try to figure out the difference?
>>
>> Barry
>>
>>
>>
>>
>> > On Feb 27, 2023, at 5:48 PM, Paul Grosse-Bley
>> > <[email protected]> wrote:
>> >
>> > Hi Barry,
>> >
>> > the reason why I wanted to change to ghost boundaries is that I was
>> > worrying about the effect of PCMG's coarsening on these boundary values.
>> >
>> > As mentioned before, I am trying to reproduce results from the hpgmg-cuda
>> > benchmark (a modified version of it, e.g. using 2nd order instead of 4th
>> > etc.).
>> > I am trying to solve the Poisson equation -\nabla^2 u = 1 with u = 0 on
>> > the boundary with rtol=1e-9. My MG solver implemented in hpgmg solves this
>> > in 40 V-cycles (I weakened it a lot by only doing smooths at the coarse
>> > level instead of CG). When I run the "same" MG solver built in PETSc on
>> > this problem, it starts out reducing the residual norm as fast or even
>> > faster for the first 20-30 iterations. But for the last order of
>> > magnitude in the residual norm it needs more than 300 V-cycles, i.e. it
>> > gets very slow. At this point I am pretty much out of ideas about what is
>> > the cause, especially since e.g. adding back cg at the coarsest level
>> > doesn't seem to change the number of iterations at all. Therefore I am
>> > suspecting the discretization to be the problem. HPGMG uses an even number
>> > of points per dimension (e.g. 256), while PCMG wants an odd number (e.g.
>> > 257). So I also tried adding another layer of boundary values for the
>> > discretization to effectively use only 254 points per dimension. This
>> > caused the solver to get even slightly worse.
>> >
>> > So can the explicit boundary values screw with the coarsening, especially
>> > when they are not finite? Because with the problem as stated in ex45 with
>> > finite (i.e. non-zero) boundary values, the MG solver takes only 18
>> > V-cycles.
>> >
>> > Best,
>> > Paul
>> >
>> >
>> >
>> > On Monday, February 27, 2023 18:17 CET, Barry Smith <[email protected]>
>> > wrote:
>> >
>> >>
>> >> Paul,
>> >>
>> >> DM_BOUNDARY_GHOSTED would result in the extra ghost locations in the
>> >> local vectors (obtained with DMCreateLocalVector()), but they will not
>> >> appear in the global vectors obtained with DMCreateGlobalVector();
>> >> perhaps this is the issue? Since they do not appear in the global vector
>> >> they will not appear in the linear system so there will be no diagonal
>> >> entries for you to set since those rows/columns do not exist in the
>> >> linear system. In other words, using DM_BOUNDARY_GHOSTED is a way to
>> >> avoid needing to put the Dirichlet values explicitly into the system
>> >> being solved; DM_BOUNDARY_GHOSTED is generally more helpful for nonlinear
>> >> systems than linear systems.
>> >>
>> >> Barry
>> >>
>> >> > On Feb 27, 2023, at 12:08 PM, Paul Grosse-Bley
>> >> > <[email protected]> wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I would like to modify src/ksp/ksp/tutorials/ex45.c to implement
>> >> > Dirichlet boundary conditions using DM_BOUNDARY_GHOSTED instead of
>> >> > using DM_BOUNDARY_NONE and explicitly implementing the boundary by
>> >> > adding diagonal-only rows.
>> >> >
>> >> > My assumption was that with DM_BOUNDARY_GHOSTED all vectors from that
>> >> > DM have the extra memory for the ghost entries and that I can basically
>> >> > use DMDAGetGhostCorners instead of DMDAGetCorners to access the array
>> >> > gotten via DMDAVecGetArray. But when I access (gxs, gys, gzs) =
>> >> > (-1,-1,-1) I get a segmentation fault. When looking at the
>> >> > implementation of DMDAVecGetArray it looked to me as if accessing (-1,
>> >> > -1, -1) should work as DMDAVecGetArray passes the ghost corners to
>> >> > VecGetArray3d which then adds the right offsets.
>> >> >
>> >> > I could not find any example using DM_BOUNDARY_GHOSTED and then
>> >> > actually accessing the ghost/boundary elements. Can I assume that they
>> >> > are set to zero for the solution vector, i.e. that u = 0 on \partial\Omega, and
>> >> > I do not need to access them at all?
>> >> >
>> >> > Best,
>> >> > Paul Große-Bley
>> >>
>>