Thank you for the detailed answer, Barry. I had hit a dead end on my side.
> If you wish to compare, for example, ex45.c with a code that does not
> incorporate the Dirichlet boundary nodes in the linear system, you can
> just use 0 boundary conditions for both codes.

Do you mean to implement the boundary conditions explicitly in e.g.
hpgmg-cuda instead of using the ghosted cells for them?

Am I right in assuming that the PCMG coarsening (using DMDA's geometric
information) will cause the boundary conditions on the coarser grids to be
nonzero (>0)?

Ideally I would like to use some kind of GPU-parallel (colored)
SOR/Gauss-Seidel instead of Jacobi. Red-Black Gauss-Seidel can be
implemented relatively easily using cuSPARSE's masked matrix-vector
products, but I have not found any information on implementing a custom
preconditioner in PETSc.
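
For what it's worth, below is a minimal sketch of how a custom
preconditioner can be plugged into PETSc through the PCSHELL mechanism.
The names RBGSCtx, MyRBGSApply and UseShellSmoother are hypothetical, and
the apply callback is only a placeholder for where the colored GS sweeps
would go; this is a sketch under those assumptions, not a tested
implementation.

#include <petscksp.h>

/* Hypothetical context for the smoother; the contents are placeholders. */
typedef struct {
  Mat A; /* operator; a real version might hold red/black index sets too */
} RBGSCtx;

/* Shell apply callback: y = M^{-1} x. A real implementation would run one
   or more red-black Gauss-Seidel sweeps here (e.g. via cuSPARSE masked
   matrix-vector products). The VecCopy just keeps the sketch runnable. */
static PetscErrorCode MyRBGSApply(PC pc, Vec x, Vec y)
{
  RBGSCtx *ctx;

  PetscFunctionBeginUser;
  PetscCall(PCShellGetContext(pc, &ctx));
  PetscCall(VecCopy(x, y)); /* replace with the colored GS sweep(s) */
  PetscFunctionReturn(0);
}

/* Install the shell PC on a KSP (e.g. a level smoother of PCMG). */
static PetscErrorCode UseShellSmoother(KSP ksp, RBGSCtx *ctx)
{
  PC pc;

  PetscFunctionBeginUser;
  PetscCall(KSPGetPC(ksp, &pc));
  PetscCall(PCSetType(pc, PCSHELL));
  PetscCall(PCShellSetContext(pc, ctx));
  PetscCall(PCShellSetApply(pc, MyRBGSApply));
  PetscFunctionReturn(0);
}

To use this as the smoother on each multigrid level, one would fetch the
per-level KSPs with PCMGGetSmoother() after the PCMG hierarchy is set up
and call UseShellSmoother() on each of them.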

Best,
Paul Grosse-Bley

On Wednesday, March 01, 2023 05:38 CET, Barry Smith <[email protected]> wrote:
      Ok, here is the situation. The command line options as given do not
result in multigrid-quality convergence in any of the runs; the error
contraction factor is around .94 (meaning that for the modes the multigrid
algorithm does worst on, it only removes about 6 percent of them per
iteration).

      But this is hidden by the initial right-hand side for the linear
system as written in ex45.c, which has O(h) values on the boundary nodes
and O(h^3) values on the interior nodes. The first iterations are largely
working on the boundary residual and making great progress attacking that,
so it looks like one has a good error contraction factor. One then sees the
error contraction factor get worse and worse in the later iterations. With
the 0 on the boundary, the iterations quickly reach the bad regime where
the error contraction factor is near one. One can see this by using
-ksp_rtol 1.e-12 and having the MG code print the residual decrease for
each iteration. Though the 0 boundary condition case appears to converge
much more slowly (since it requires many more iterations), if you factor
out the huge head start the nonzero boundary condition case gets at the
beginning (in terms of decreasing the residual), you see they both have an
asymptotic error contraction factor of around .94 (which is horrible for
multigrid).

      If I now add -mg_levels_ksp_richardson_scale .9
-mg_coarse_ksp_richardson_scale .9 and rerun the two cases (nonzero and
zero boundary right-hand side), they take 35 and 42 iterations,
respectively (much better):
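
(The exact command line for these runs is not shown in this thread; a
plausible invocation, with the grid and smoother options being assumptions,
would be something like

  ./ex45 -da_refine 4 -ksp_type richardson -pc_type mg \
    -mg_levels_ksp_type richardson -mg_levels_pc_type jacobi \
    -mg_coarse_ksp_type richardson -mg_coarse_pc_type jacobi \
    -mg_levels_ksp_richardson_scale .9 -mg_coarse_ksp_richardson_scale .9 \
    -ksp_rtol 1.e-12 -ksp_converged_reason

where all the individual options are standard PETSc options; only their
combination here is guessed.)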
initial residual norm 14.6993
next residual norm 0.84167 0.0572591
next residual norm 0.0665392 0.00452668
next residual norm 0.0307273 0.00209039
next residual norm 0.0158949 0.00108134
next residual norm 0.00825189 0.000561378
next residual norm 0.00428474 0.000291492
next residual norm 0.00222482 0.000151355
next residual norm 0.00115522 7.85898e-05
next residual norm 0.000599836 4.0807e-05
next residual norm 0.000311459 2.11887e-05
next residual norm 0.000161722 1.1002e-05
next residual norm 8.39727e-05 5.71269e-06
next residual norm 4.3602e-05 2.96626e-06
next residual norm 2.26399e-05 1.5402e-06
next residual norm 1.17556e-05 7.99735e-07
next residual norm 6.10397e-06 4.15255e-07
next residual norm 3.16943e-06 2.15617e-07
next residual norm 1.64569e-06 1.11957e-07
next residual norm 8.54511e-07 5.81326e-08
next residual norm 4.43697e-07 3.01848e-08
next residual norm 2.30385e-07 1.56732e-08
next residual norm 1.19625e-07 8.13815e-09
next residual norm 6.21143e-08 4.22566e-09
next residual norm 3.22523e-08 2.19413e-09
next residual norm 1.67467e-08 1.13928e-09
next residual norm 8.69555e-09 5.91561e-10
next residual norm 4.51508e-09 3.07162e-10
next residual norm 2.34441e-09 1.59491e-10
next residual norm 1.21731e-09 8.28143e-11
next residual norm 6.32079e-10 4.30005e-11
next residual norm 3.28201e-10 2.23276e-11
next residual norm 1.70415e-10 1.15934e-11
next residual norm 8.84865e-11 6.01976e-12
next residual norm 4.59457e-11 3.1257e-12
next residual norm 2.38569e-11 1.62299e-12
next residual norm 1.23875e-11 8.42724e-13
Linear solve converged due to CONVERGED_RTOL iterations 35
Residual norm 1.23875e-11 
initial residual norm 172.601
next residual norm 154.803 0.896887
next residual norm 66.9409 0.387837
next residual norm 34.4572 0.199636
next residual norm 17.8836 0.103612
next residual norm 9.28582 0.0537995
next residual norm 4.82161 0.027935
next residual norm 2.50358 0.014505
next residual norm 1.29996 0.0075316
next residual norm 0.674992 0.00391071
next residual norm 0.350483 0.0020306
next residual norm 0.181985 0.00105437
next residual norm 0.094494 0.000547472
next residual norm 0.0490651 0.000284269
next residual norm 0.0254766 0.000147604
next residual norm 0.0132285 7.6642e-05
next residual norm 0.00686876 3.97956e-05
next residual norm 0.00356654 2.06635e-05
next residual norm 0.00185189 1.07293e-05
next residual norm 0.000961576 5.5711e-06
next residual norm 0.000499289 2.89274e-06
next residual norm 0.000259251 1.50203e-06
next residual norm 0.000134614 7.79914e-07
next residual norm 6.98969e-05 4.04963e-07
next residual norm 3.62933e-05 2.10273e-07
next residual norm 1.88449e-05 1.09182e-07
next residual norm 9.78505e-06 5.66919e-08
next residual norm 5.0808e-06 2.94367e-08
next residual norm 2.63815e-06 1.52847e-08
next residual norm 1.36984e-06 7.93645e-09
next residual norm 7.11275e-07 4.12093e-09
next residual norm 3.69322e-07 2.13975e-09
next residual norm 1.91767e-07 1.11105e-09
next residual norm 9.95733e-08 5.769e-10
next residual norm 5.17024e-08 2.99549e-10
next residual norm 2.6846e-08 1.55538e-10
next residual norm 1.39395e-08 8.07615e-11
next residual norm 7.23798e-09 4.19348e-11
next residual norm 3.75824e-09 2.17742e-11
next residual norm 1.95138e-09 1.13058e-11
next residual norm 1.01327e-09 5.87059e-12
next residual norm 5.26184e-10 3.04856e-12
next residual norm 2.73182e-10 1.58274e-12
next residual norm 1.41806e-10 8.21586e-13
Linear solve converged due to CONVERGED_RTOL iterations 42
Residual norm 1.41806e-10

Notice that in the first run the residual norm still dives much more
quickly for the first 2 iterations than in the second run. This is because
the first run has "lucky error" that gets wiped out easily by the big
boundary term. After that you can see that the convergence of the two runs
is very similar, with both having a reasonable error contraction factor of
.51. I've attached the modified src/ksp/pc/impls/mg/mg.c that prints the
residuals along the way.
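
A similar per-iteration printout can also be obtained without modifying
mg.c by registering a KSP monitor. The sketch below is an assumption about
how to reproduce the two columns above (residual norm and its ratio to the
initial norm); it is not the code from the attached file.

#include <petscksp.h>

/* Prints each residual norm and its ratio to the initial residual norm,
   mimicking the two-column output above. The static variable is a
   sketch-level shortcut; a real monitor would keep state in mctx. */
static PetscErrorCode RatioMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *mctx)
{
  static PetscReal r0 = 1.0;

  PetscFunctionBeginUser;
  if (it == 0) {
    r0 = rnorm;
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "initial residual norm %g\n", (double)rnorm));
  } else {
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "next residual norm %g %g\n", (double)rnorm, (double)(rnorm / r0)));
  }
  PetscFunctionReturn(0);
}

/* Register it on the outer KSP before KSPSolve():
     PetscCall(KSPMonitorSet(ksp, RatioMonitor, NULL, NULL)); */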
