I'm still relatively new to PETSc. I was using DMDASetUniformCoordinates and DMGetBoundingBox together; in hindsight, that was unnecessary. I think the simplest way to prevent anyone else from making the same mistake is to add a caveat to the DMGetBoundingBox documentation page.
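A minimal sketch of the caveat in code (grid size, bounds, and options here are illustrative, not taken from the thread): on a DM that spans the whole domain, DMGetBoundingBox returns the global box, but on a DM that covers only a subdomain, such as the per-subdomain DMs NASM creates, the same call returns that subdomain's box.

#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM        da;
  PetscReal gmin[3], gmax[3];

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* A 20x20 DMDA over the whole domain, with uniform coordinates on [-1,1]^2 */
  PetscCall(DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_BOX, 20, 20, PETSC_DECIDE, PETSC_DECIDE,
                         1, 1, NULL, NULL, &da));
  PetscCall(DMSetFromOptions(da));
  PetscCall(DMSetUp(da));
  PetscCall(DMDASetUniformCoordinates(da, -1.0, 1.0, -1.0, 1.0, 0.0, 0.0));

  /* On this DM the call returns the global box, [-1,1] x [-1,1] ... */
  PetscCall(DMGetBoundingBox(da, gmin, gmax));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "box: [%g, %g] x [%g, %g]\n",
                        (double)gmin[0], (double)gmax[0],
                        (double)gmin[1], (double)gmax[1]));
  /* ... but on a DM that comprises only a subdomain -- e.g. the per-subdomain
     DMs that SNESNASM creates -- it returns that subdomain's box instead. */

  PetscCall(DMDestroy(&da));
  PetscCall(PetscFinalize());
  return 0;
}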
On Thu, May 12, 2022 at 2:02 PM Matthew Knepley <[email protected]> wrote:

> On Thu, May 12, 2022 at 1:03 PM Takahashi, Tadanaga <[email protected]> wrote:
>
>> Thank you for the feedback. We figured out what was causing the issue. We were using DMGetBoundingBox <https://petsc.org/main/docs/manualpages/DM/DMGetBoundingBox/> in order to get the limits of the global domain, but gmin and gmax contained limits for the local subdomains when we ran the code with NASM. Hence, our local coordinates xi and yj were completely wrong. The documentation states that DMGetBoundingBox gets the global limits. I believe this is a mistake.
>
> I think I can explain this, and maybe you can tell us how to improve the documentation.
>
> I believe we make a new DM that comprises only the subdomain. Then the bounding box for this subdomain will only contain itself, not the original domain.
>
> Where should we say this?
>
>   Thanks,
>
>     Matt
>
>> This is our new output:
>>
>> $ mpiexec -n 4 ./test1 -t1_N 20 -snes_max_it 50 -snes_monitor -snes_view -da_overlap 3 -snes_type nasm -snes_nasm_type restrict
>>   0 SNES Function norm 7.244681057908e+02
>>   1 SNES Function norm 4.394913250889e+01
>>   2 SNES Function norm 1.823326663029e+01
>>   3 SNES Function norm 7.033938512358e+00
>>   4 SNES Function norm 2.797351504285e+00
>>   5 SNES Function norm 1.130613777736e+00
>>   6 SNES Function norm 4.605418417192e-01
>>   7 SNES Function norm 1.882307001920e-01
>>   8 SNES Function norm 7.704148683921e-02
>>   9 SNES Function norm 3.155090858782e-02
>>  10 SNES Function norm 1.292418188473e-02
>>  11 SNES Function norm 5.294645671797e-03
>>  12 SNES Function norm 2.169143207557e-03
>>  13 SNES Function norm 8.886826738192e-04
>>  14 SNES Function norm 3.640894847145e-04
>>  15 SNES Function norm 1.491663153414e-04
>>  16 SNES Function norm 6.111303899450e-05
>>  17 SNES Function norm 2.503785968501e-05
>>  18 SNES Function norm 1.025795062417e-05
>>  19 SNES Function norm 4.202657921479e-06
>> SNES Object: 4 MPI processes
>>   type: nasm
>>     total subdomain blocks = 4
>>     Local solver information for first block on rank 0:
>>     Use -snes_view ::ascii_info_detail to display information for all blocks
>>       SNES Object: (sub_) 1 MPI processes
>>         type: newtonls
>>         maximum iterations=50, maximum function evaluations=10000
>>         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>         total number of linear solver iterations=2
>>         total number of function evaluations=3
>>         norm schedule ALWAYS
>>         Jacobian is built using a DMDA local Jacobian
>>         SNESLineSearch Object: (sub_) 1 MPI processes
>>           type: bt
>>             interpolation: cubic
>>             alpha=1.000000e-04
>>           maxstep=1.000000e+08, minlambda=1.000000e-12
>>           tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>           maximum iterations=40
>>         KSP Object: (sub_) 1 MPI processes
>>           type: preonly
>>           maximum iterations=10000, initial guess is zero
>>           tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>           left preconditioning
>>           using NONE norm type for convergence test
>>         PC Object: (sub_) 1 MPI processes
>>           type: lu
>>             out-of-place factorization
>>             tolerance for zero pivot 2.22045e-14
>>             matrix ordering: nd
>>             factor fill ratio given 5., needed 2.13732
>>               Factored matrix follows:
>>                 Mat Object: 1 MPI processes
>>                   type: seqaij
>>                   rows=169, cols=169
>>                   package used to perform factorization: petsc
>>                   total: nonzeros=13339, allocated nonzeros=13339
>>                     using I-node routines: found 104 nodes, limit used is 5
>>           linear system matrix = precond matrix:
>>           Mat Object: 1 MPI processes
>>             type: seqaij
>>             rows=169, cols=169
>>             total: nonzeros=6241, allocated nonzeros=6241
>>             total number of mallocs used during MatSetValues calls=0
>>               not using I-node routines
>>   maximum iterations=50, maximum function evaluations=10000
>>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>   total number of function evaluations=20
>>   norm schedule ALWAYS
>>   Jacobian is built using a DMDA local Jacobian
>> problem ex10 on 20 x 20 point 2D grid with d = 3, and eps = 0.082:
>>   error |u-uexact|_inf = 2.879e-02, |u-uexact|_h = 1.707e-02
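The fix that behavior suggests, sketched in code (the struct and function names are illustrative, not from the posters' program): capture the global limits once from the DM you created yourself, and pass them to the local residual through the user context instead of querying the DM the solver hands back.

#include <petscdmda.h>
#include <petscsnes.h>

typedef struct {
  PetscReal xmin, xmax, ymin, ymax; /* global domain limits, captured once */
} AppCtx;

/* Local residual stub: build xi, yj from the cached global limits rather than
   calling DMGetBoundingBox on the incoming DM, which under -snes_type nasm
   covers a single subdomain. */
static PetscErrorCode FormFunctionLocal(DMDALocalInfo *info, PetscScalar **x, PetscScalar **f, void *ctx)
{
  AppCtx *user = (AppCtx *)ctx;
  /* ... compute xi, yj from user->xmin .. user->ymax and info's index ranges ... */
  (void)info; (void)x; (void)f; (void)user;
  return 0;
}

/* At setup time, with 'da' the DMDA that spans the whole domain: */
static PetscErrorCode AttachResidual(DM da, AppCtx *user)
{
  PetscReal gmin[3], gmax[3];
  PetscCall(DMGetBoundingBox(da, gmin, gmax)); /* safe: da covers the full domain */
  user->xmin = gmin[0]; user->xmax = gmax[0];
  user->ymin = gmin[1]; user->ymax = gmax[1];
  PetscCall(DMDASNESSetFunctionLocal(da, INSERT_VALUES, (DMDASNESFunction)FormFunctionLocal, user));
  return 0;
}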
>>
>> On Thu, May 12, 2022 at 9:37 AM Matthew Knepley <[email protected]> wrote:
>>
>>> Your subdomain solves do not appear to be producing descent whatsoever. Possible reasons:
>>>
>>> 1) Your subdomain Jacobians are wrong (this is usually the problem)
>>>
>>> 2) You have some global coupling field for which local solves give no descent. (For this you want nonlinear elimination, I think.)
>>>
>>>   Thanks,
>>>
>>>     Matt
>>>
>>> On Thu, May 12, 2022 at 9:02 AM Takahashi, Tadanaga <[email protected]> wrote:
>>>
>>>> I ran the code with the additional options, but the raw output is about 75,000 lines. I cannot paste it directly in the email. The output is in the attached file.
>>>>
>>>> On Wed, May 11, 2022 at 11:44 PM Jed Brown <[email protected]> wrote:
>>>>
>>>>> Can you add -snes_linesearch_monitor -sub_snes_linesearch_monitor -ksp_converged_reason and send the output?
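A cheap way to check point (1) above is PETSc's built-in Jacobian tester, which compares a hand-coded Jacobian against a finite-difference one. Since the NASM subsolvers carry the sub_ prefix shown in -snes_view, a command along these lines should exercise each subdomain Jacobian (assuming the option propagates to the subsolvers the way the other sub_ options in this thread do):

$ mpiexec -n 4 ./test1 -t1_N 20 -da_overlap 3 -snes_type nasm -snes_nasm_type restrict -sub_snes_test_jacobian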
>>>>>
>>>>> "Takahashi, Tadanaga" <[email protected]> writes:
>>>>>
>>>>> > Hello,
>>>>> >
>>>>> > We are working on a finite difference solver for a 2D nonlinear PDE with Dirichlet boundary conditions on a rectangular domain. Our goal is to solve the problem with parallel nonlinear additive Schwarz (NASM) as the outer solver. Our code is similar to SNES example 5 <https://petsc.org/release/src/snes/tutorials/ex5.c.html>. In example 5, the parallel NASM can be executed with a command like `mpiexec -n 4 ./ex5 -mms 3 -snes_type nasm -snes_nasm_type restrict -da_overlap 2`, which gives a convergent result. We assume this is the correct usage. A comment in the source code for NASM mentions that NASM should be a preconditioner, but there is no documentation on the usage. The Brune paper does not cover parallel NASM either. We observed that increasing the overlap leads to fewer Schwarz iterations. The parallelization works seamlessly for an arbitrary number of subdomains. This is the type of behavior we were expecting from our code.
>>>>> >
>>>>> > Our method uses a box-style stencil of width d = ceil(N^(1/3)) on an N by N DMDA. The finite difference stencil consists of 4d+1 points spread out in a diamond formation. If a stencil point is out of bounds, then it is projected onto the boundary curve. Since the nodes on the boundary curve would result in an irregular mesh, we chose not to treat boundary nodes as unknowns, as in Example 5. We use DMDACreate2d to create the DA for the interior points and DMDASNESSetFunctionLocal to associate the residual function with the SNES object.
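A sketch of the setup described in that paragraph (identifiers are illustrative; the posters' actual code is not in the thread, and AppCtx/FormFunctionLocal are as sketched earlier):

#include <petscdmda.h>
#include <petscsnes.h>

static PetscErrorCode SetupSolver(PetscInt N, AppCtx *user, SNES *snes_out)
{
  /* Box stencil wide enough for the 4d+1-point stencil, d = ceil(N^(1/3)) */
  PetscInt d = (PetscInt)PetscCeilReal(PetscPowReal((PetscReal)N, 1.0 / 3.0));
  DM       da;
  SNES     snes;

  /* Interior-only N x N grid */
  PetscCall(DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                         DMDA_STENCIL_BOX, N, N, PETSC_DECIDE, PETSC_DECIDE,
                         1, d, NULL, NULL, &da));
  PetscCall(DMSetFromOptions(da)); /* -da_overlap etc. come from the options database */
  PetscCall(DMSetUp(da));
  PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
  PetscCall(SNESSetDM(snes, da));
  PetscCall(DMDASNESSetFunctionLocal(da, INSERT_VALUES, (DMDASNESFunction)FormFunctionLocal, user));
  PetscCall(SNESSetFromOptions(snes)); /* -snes_type nasm -snes_nasm_type restrict ... */
  *snes_out = snes;
  return 0;
}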
>>>>> >
>>>>> > Our code works serially. We have also tested our code with Newton-Krylov-Schwarz (NKS) by running something akin to `mpiexec -n <n> ./solve -snes_type newtonls`. We have tested NKS for several numbers of subdomains and overlaps, and the code works as expected, so we have some confidence in the correctness of our code. The overlapping NASM was implemented in MATLAB, so we know the method converges. However, the parallel NASM will not converge with our PETSc code. We don't understand why NKS works while NASM does not. The residual F-norm monotonically decreases and then stagnates.
>>>>> >
>>>>> > Here is an example of the output when attempting to run NASM in parallel:
>>>>> >
>>>>> > takahashi@ubuntu:~/Desktop/MA-DDM/Cpp/Rectangle$ mpiexec -n 4 ./test1 -t1_N 20 -snes_max_it 50 -snes_monitor -snes_view -da_overlap 3 -snes_type nasm -snes_nasm_type restrict
>>>>> >   0 SNES Function norm 7.244681057908e+02
>>>>> >   1 SNES Function norm 1.237688062971e+02
>>>>> >   2 SNES Function norm 1.068926073552e+02
>>>>> >   3 SNES Function norm 1.027563237834e+02
>>>>> >   4 SNES Function norm 1.022184806736e+02
>>>>> >   5 SNES Function norm 1.020818227640e+02
>>>>> >   6 SNES Function norm 1.020325629121e+02
>>>>> >   7 SNES Function norm 1.020149036595e+02
>>>>> >   8 SNES Function norm 1.020088110545e+02
>>>>> >   9 SNES Function norm 1.020067198030e+02
>>>>> >  10 SNES Function norm 1.020060034469e+02
>>>>> >  11 SNES Function norm 1.020057582380e+02
>>>>> >  12 SNES Function norm 1.020056743241e+02
>>>>> >  13 SNES Function norm 1.020056456101e+02
>>>>> >  14 SNES Function norm 1.020056357849e+02
>>>>> >  15 SNES Function norm 1.020056324231e+02
>>>>> >  16 SNES Function norm 1.020056312727e+02
>>>>> >  17 SNES Function norm 1.020056308791e+02
>>>>> >  18 SNES Function norm 1.020056307444e+02
>>>>> >  19 SNES Function norm 1.020056306983e+02
>>>>> >  20 SNES Function norm 1.020056306826e+02
>>>>> >  21 SNES Function norm 1.020056306772e+02
>>>>> >  22 SNES Function norm 1.020056306753e+02
>>>>> >  23 SNES Function norm 1.020056306747e+02
>>>>> >  24 SNES Function norm 1.020056306745e+02
>>>>> >  25 SNES Function norm 1.020056306744e+02
>>>>> >  26 SNES Function norm 1.020056306744e+02
>>>>> >  27 SNES Function norm 1.020056306744e+02
>>>>> >  28 SNES Function norm 1.020056306744e+02
>>>>> >  29 SNES Function norm 1.020056306744e+02
>>>>> >  30 SNES Function norm 1.020056306744e+02
>>>>> >  31 SNES Function norm 1.020056306744e+02
>>>>> >  32 SNES Function norm 1.020056306744e+02
>>>>> >  33 SNES Function norm 1.020056306744e+02
>>>>> >  34 SNES Function norm 1.020056306744e+02
>>>>> >  35 SNES Function norm 1.020056306744e+02
>>>>> >  36 SNES Function norm 1.020056306744e+02
>>>>> >  37 SNES Function norm 1.020056306744e+02
>>>>> >  38 SNES Function norm 1.020056306744e+02
>>>>> >  39 SNES Function norm 1.020056306744e+02
>>>>> >  40 SNES Function norm 1.020056306744e+02
>>>>> >  41 SNES Function norm 1.020056306744e+02
>>>>> >  42 SNES Function norm 1.020056306744e+02
>>>>> >  43 SNES Function norm 1.020056306744e+02
>>>>> >  44 SNES Function norm 1.020056306744e+02
>>>>> >  45 SNES Function norm 1.020056306744e+02
>>>>> >  46 SNES Function norm 1.020056306744e+02
>>>>> >  47 SNES Function norm 1.020056306744e+02
>>>>> >  48 SNES Function norm 1.020056306744e+02
>>>>> >  49 SNES Function norm 1.020056306744e+02
>>>>> >  50 SNES Function norm 1.020056306744e+02
>>>>> > SNES Object: 4 MPI processes
>>>>> >   type: nasm
>>>>> >     total subdomain blocks = 4
>>>>> >     Local solver information for first block on rank 0:
>>>>> >     Use -snes_view ::ascii_info_detail to display information for all blocks
>>>>> >       SNES Object: (sub_) 1 MPI processes
>>>>> >         type: newtonls
>>>>> >         maximum iterations=50, maximum function evaluations=10000
>>>>> >         tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>>>> >         total number of linear solver iterations=22
>>>>> >         total number of function evaluations=40
>>>>> >         norm schedule ALWAYS
>>>>> >         Jacobian is built using a DMDA local Jacobian
>>>>> >         SNESLineSearch Object: (sub_) 1 MPI processes
>>>>> >           type: bt
>>>>> >             interpolation: cubic
>>>>> >             alpha=1.000000e-04
>>>>> >           maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>> >           tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>>> >           maximum iterations=40
>>>>> >         KSP Object: (sub_) 1 MPI processes
>>>>> >           type: preonly
>>>>> >           maximum iterations=10000, initial guess is zero
>>>>> >           tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>> >           left preconditioning
>>>>> >           using NONE norm type for convergence test
>>>>> >         PC Object: (sub_) 1 MPI processes
>>>>> >           type: lu
>>>>> >             out-of-place factorization
>>>>> >             tolerance for zero pivot 2.22045e-14
>>>>> >             matrix ordering: nd
>>>>> >             factor fill ratio given 5., needed 2.13732
>>>>> >               Factored matrix follows:
>>>>> >                 Mat Object: 1 MPI processes
>>>>> >                   type: seqaij
>>>>> >                   rows=169, cols=169
>>>>> >                   package used to perform factorization: petsc
>>>>> >                   total: nonzeros=13339, allocated nonzeros=13339
>>>>> >                     using I-node routines: found 104 nodes, limit used is 5
>>>>> >           linear system matrix = precond matrix:
>>>>> >           Mat Object: 1 MPI processes
>>>>> >             type: seqaij
>>>>> >             rows=169, cols=169
>>>>> >             total: nonzeros=6241, allocated nonzeros=6241
>>>>> >             total number of mallocs used during MatSetValues calls=0
>>>>> >               not using I-node routines
>>>>> >   maximum iterations=50, maximum function evaluations=10000
>>>>> >   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>>>>> >   total number of function evaluations=51
>>>>> >   norm schedule ALWAYS
>>>>> >   Jacobian is built using a DMDA local Jacobian
>>>>> > problem ex10 on 20 x 20 point 2D grid with d = 3, and eps = 0.082:
>>>>> >   error |u-uexact|_inf = 3.996e-01, |u-uexact|_h = 2.837e-01
>>>>> >
>>>>> > We have been stuck on this for a while now. We do not know how to debug this issue. Please let us know if you have any insights.
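On the "NASM should be a preconditioner" remark in the original message: one way to use it in that role is as a nonlinear preconditioner (NPC) attached to an outer SNES. A minimal sketch, with the outer solver choice (NGMRES) being illustrative:

#include <petscsnes.h>

/* Sketch: NASM as a nonlinear preconditioner for an outer SNES
   (solver choices illustrative; untested on the problem in this thread). */
static PetscErrorCode UseNASMAsPreconditioner(SNES snes)
{
  SNES npc;
  PetscCall(SNESSetType(snes, SNESNGMRES)); /* outer nonlinear accelerator */
  PetscCall(SNESGetNPC(snes, &npc));        /* inner solver used as preconditioner */
  PetscCall(SNESSetType(npc, SNESNASM));
  return 0;
}

From the command line the equivalent should be -snes_type ngmres -npc_snes_type nasm -npc_snes_nasm_type restrict; the aspin type composes Newton with an NASM-style inner solve in a similar spirit.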
