The scaling might well be the problem, especially since I do not know what you 
mean by scaling it according to FE (finite elements).

To reproduce the issue with a smaller problem, change the ComputeRHS function in 
ex45.c so that the loop body becomes

if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
  barray[k][j][i] = 0.0;
} else {
  barray[k][j][i] = 1.0;
}
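
In context, the modified loop inside ComputeRHS looks roughly like the sketch 
below (not the verbatim ex45.c source; dm, b and barray are the DMDA, the 
right-hand-side vector and the array from DMDAVecGetArray, as in the existing 
example):

PetscCall(DMDAGetInfo(dm, 0, &mx, &my, &mz, 0, 0, 0, 0, 0, 0, 0, 0, 0));
PetscCall(DMDAGetCorners(dm, &xs, &ys, &zs, &xm, &ym, &zm));
PetscCall(DMDAVecGetArray(dm, b, &barray));
for (k = zs; k < zs + zm; k++) {
  for (j = ys; j < ys + ym; j++) {
    for (i = xs; i < xs + xm; i++) {
      if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
        barray[k][j][i] = 0.0; /* homogeneous Dirichlet boundary rows */
      } else {
        barray[k][j][i] = 1.0; /* constant forcing, deliberately left unscaled */
      }
    }
  }
}
PetscCall(DMDAVecRestoreArray(dm, b, &barray));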

Change the grid dimensions from 7 to e.g. 33 (I scaled the problem down so that 
it runs quickly even without a GPU) and then run with

-ksp_converged_reason -ksp_type richardson -ksp_rtol 1e-09 -pc_type mg 
-pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 6 
-mg_levels_ksp_converged_maxits -mg_levels_pc_type jacobi -mg_coarse_ksp_type 
richardson -mg_coarse_ksp_max_it 6 -mg_coarse_ksp_converged_maxits 
-mg_coarse_pc_type jacobi
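
For reference, a full invocation might look like this (assuming ex45 calls 
DMSetFromOptions on the DMDA, so the grid size can alternatively be set with 
-da_grid_x/y/z on the command line instead of editing the source):

./ex45 -da_grid_x 33 -da_grid_y 33 -da_grid_z 33 \
  -ksp_converged_reason -ksp_type richardson -ksp_rtol 1e-09 -pc_type mg \
  -pc_mg_levels 3 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 6 \
  -mg_levels_ksp_converged_maxits -mg_levels_pc_type jacobi \
  -mg_coarse_ksp_type richardson -mg_coarse_ksp_max_it 6 \
  -mg_coarse_ksp_converged_maxits -mg_coarse_pc_type jacobi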

You will find that it now takes 145 iterations, compared to 25 with the original 
ex45 RHS. My hpgmg-cuda implementation (using 32^3 points) takes 41 iterations.

What do I have to change the diagonal entries of the matrix at the boundary to, 
according to the FE view? Right now the diagonal is completely constant.

Paul

On Tuesday, February 28, 2023 00:23 CET, Barry Smith <bsm...@petsc.dev> wrote:
 
I have not seen explicitly including, or excluding, the Dirichlet boundary 
values in the system have a significant effect on the convergence, so long as 
you SCALE the diagonal rows (of those Dirichlet points) by a value similar to 
the other entries along the diagonal. If they are scaled completely 
differently, that can screw up the convergence. For 
src/ksp/ksp/tutorials/ex45.c I see that the appropriate scaling is used. (Note 
that the scaling should come from a finite element view of the discretization, 
even if the discretization is finite differences, as is done in ex45.c.)
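
For concreteness, the boundary rows in ex45.c's ComputeMatrix() are assembled 
roughly as in the sketch below (a from-memory sketch, so check the actual 
source): the Dirichlet diagonal gets the same value as the interior diagonal, 
rather than an arbitrary 1.0.

if (i == 0 || j == 0 || k == 0 || i == mx - 1 || j == my - 1 || k == mz - 1) {
  /* Dirichlet row: diagonal only, scaled like an interior diagonal entry */
  v[0] = 2.0 * (HxHydHz + HxHzdHy + HyHzdHx);
  PetscCall(MatSetValuesStencil(B, 1, &row, 1, &row, v, INSERT_VALUES));
} else {
  /* interior row: 7-point stencil with the same diagonal value */
  v[0] = -HxHydHz; col[0].i = i;     col[0].j = j;     col[0].k = k - 1;
  v[1] = -HxHzdHy; col[1].i = i;     col[1].j = j - 1; col[1].k = k;
  v[2] = -HyHzdHx; col[2].i = i - 1; col[2].j = j;     col[2].k = k;
  v[3] = 2.0 * (HxHydHz + HxHzdHy + HyHzdHx);
                   col[3].i = i;     col[3].j = j;     col[3].k = k;
  v[4] = -HyHzdHx; col[4].i = i + 1; col[4].j = j;     col[4].k = k;
  v[5] = -HxHzdHy; col[5].i = i;     col[5].j = j + 1; col[5].k = k;
  v[6] = -HxHydHz; col[6].i = i;     col[6].j = j;     col[6].k = k + 1;
  PetscCall(MatSetValuesStencil(B, 1, &row, 7, col, v, INSERT_VALUES));
}

Here HxHydHz, HxHzdHy and HyHzdHx are the mesh-spacing factors Hx*Hy/Hz, 
Hx*Hz/Hy and Hy*Hz/Hx from ex45.c.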

Are you willing to share the two codes so we can take a look with experienced 
eyes to try to figure out the difference?

Barry




> On Feb 27, 2023, at 5:48 PM, Paul Grosse-Bley 
> <paul.grosse-b...@ziti.uni-heidelberg.de> wrote:
>
> Hi Barry,
>
> the reason why I wanted to change to ghost boundaries is that I was worried 
> about the effect of PCMG's coarsening on these boundary values.
>
> As mentioned before, I am trying to reproduce results from the hpgmg-cuda 
> benchmark (a modified version of it, e.g. using a 2nd-order instead of a 
> 4th-order discretization, etc.).
> I am trying to solve the Poisson equation -\nabla^2 u = 1 with u = 0 on the 
> boundary, with rtol = 1e-9. My MG solver implemented in hpgmg solves this in 
> 40 V-cycles (I weakened it a lot by only doing smooths at the coarse level 
> instead of CG). When I run the "same" MG solver built in PETSc on this 
> problem, it starts out reducing the residual norm as fast or even faster for 
> the first 20-30 iterations, but for the last order of magnitude in the 
> residual norm it needs more than 300 V-cycles, i.e. it gets very slow. At 
> this point I am pretty much out of ideas about what the cause is, especially 
> since e.g. adding back CG at the coarsest level does not seem to change the 
> number of iterations at all. Therefore I suspect the discretization to be 
> the problem. HPGMG uses an even number of points per dimension (e.g. 256), 
> while PCMG wants an odd number (e.g. 257). So I also tried adding another 
> layer of boundary values to the discretization to effectively use only 254 
> points per dimension. This made the solver even slightly worse.
>
> So can the explicit boundary values screw with the coarsening, especially 
> when they are zero? Because with the problem as stated in ex45, with finite 
> (i.e. non-zero) boundary values, the MG solver takes only 18 V-cycles.
>
> Best,
> Paul
>
>
>
> On Monday, February 27, 2023 18:17 CET, Barry Smith <bsm...@petsc.dev> wrote:
>
>>
>> Paul,
>>
>> DM_BOUNDARY_GHOSTED results in extra ghost locations appearing in the local 
>> vectors (obtained with DMCreateLocalVector()), but they will not appear in 
>> the global vectors obtained with DMCreateGlobalVector(); perhaps this is the 
>> issue? Since they do not appear in the global vector, they will not appear 
>> in the linear system, so there are no diagonal entries for you to set: those 
>> rows/columns simply do not exist in the linear system. In other words, using 
>> DM_BOUNDARY_GHOSTED is a way to avoid needing to put the Dirichlet values 
>> explicitly into the system being solved; DM_BOUNDARY_GHOSTED is generally 
>> more helpful for nonlinear systems than for linear systems.
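>>
>> As a minimal sketch of that difference (not from any particular example; it 
>> assumes a single rank, a 7x7x7 grid, one degree of freedom and stencil 
>> width 1):
>>
>> DM       da;
>> Vec      gvec, lvec;
>> PetscInt nglobal, nlocal;
>> PetscCall(DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED,
>>                        DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED,
>>                        DMDA_STENCIL_STAR, 7, 7, 7, PETSC_DECIDE, PETSC_DECIDE,
>>                        PETSC_DECIDE, 1, 1, NULL, NULL, NULL, &da));
>> PetscCall(DMSetUp(da));
>> PetscCall(DMCreateGlobalVector(da, &gvec)); /* no slots for ghost points */
>> PetscCall(DMCreateLocalVector(da, &lvec));  /* ghost points included */
>> PetscCall(VecGetSize(gvec, &nglobal));      /* 7^3 = 343 rows in the system */
>> PetscCall(VecGetSize(lvec, &nlocal));       /* 9^3 = 729 on a single rank */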
>>
>> Barry
>>
>> > On Feb 27, 2023, at 12:08 PM, Paul Grosse-Bley 
>> > <paul.grosse-b...@ziti.uni-heidelberg.de> wrote:
>> >
>> > Hi,
>> >
>> > I would like to modify src/ksp/ksp/tutorials/ex45.c to implement Dirichlet 
>> > boundary conditions using DM_BOUNDARY_GHOSTED instead of using 
>> > DM_BOUNDARY_NONE and explicitly implementing the boundary by adding 
>> > diagonal-only rows.
>> >
>> > My assumption was that with DM_BOUNDARY_GHOSTED all vectors from that DM 
>> > have the extra memory for the ghost entries, and that I can basically use 
>> > DMDAGetGhostCorners instead of DMDAGetCorners to access the array obtained 
>> > via DMDAVecGetArray. But when I access (gxs, gys, gzs) = (-1,-1,-1) I get 
>> > a segmentation fault. Looking at the implementation of DMDAVecGetArray, it 
>> > seemed to me that accessing (-1, -1, -1) should work, since 
>> > DMDAVecGetArray passes the ghost corners to VecGetArray3d, which then adds 
>> > the right offsets.
>> >
>> > I could not find any example that uses DM_BOUNDARY_GHOSTED and then 
>> > actually accesses the ghost/boundary elements. Can I assume that they are 
>> > set to zero for the solution vector, i.e. that u = 0 on \partial\Omega, 
>> > and that I do not need to access them at all?
>> >
>> > Best,
>> > Paul Große-Bley
>>
 
