Generically, dense matrix factorization costs ~[DOF]^3. Sparse factorization does much better than ~[DOF]^3; how much better depends on the problem and on the space dimension (1, 2, or 3). Iterative solvers, when they work well, offer the possibility of ~[DOF], which is why they are needed for very large problems.
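As a rough check on the exponent, the -pc_type lu timings in the message quoted below can be fitted on a log-log scale. This is only a sketch: it assumes DOF = m * n for ex2 (consistent with the matrix size of 1.e6 quoted for -m 1000 -n 1000), and `runs` and `fit_exponent` are names made up for illustration, not anything from PETSc.

```python
import math

# (grid size m = n, wall time in seconds), copied from the -pc_type lu runs
# in the quoted message; nothing here is measured anew.
runs = [(100, 0.016), (250, 0.150), (500, 1.007), (750, 3.142), (1000, 7.442)]


def fit_exponent(samples):
    """Least-squares slope of log(time) against log(DOF), with DOF = m * m (assumed)."""
    xs = [math.log(m * m) for m, _ in samples]
    ys = [math.log(t) for _, t in samples]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


exponent = fit_exponent(runs)
print(f"wall time ~ DOF^{exponent:.2f}")
```

On those five points the fitted slope should land close to the 1.333 quoted below, well under the dense ~[DOF]^3. Five timings from one laptop are a thin basis, so treat the number as indicative only.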
> On Mar 29, 2024, at 3:29 PM, Zou, Ling via petsc-users <[email protected]> wrote:
>
> Note that [Wall Time] ~ [DOF]^1.333, instead of being ~[DOF]^3.
> The [DOF]^3 rule was the scary part that made me want to avoid LU.
>
> -Ling
>
> From: petsc-users <[email protected]> on behalf of Zou, Ling via petsc-users <[email protected]>
> Date: Friday, March 29, 2024 at 2:06 PM
> To: Barry Smith <[email protected]>, Zhang, Hong <[email protected]>
> Cc: [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> Hong, are these results somewhat expected? I don’t see any speed up for using 2 processors (maybe I don’t have 2 processors?).
>
> Option                                                      Wall Time (sec)
> -pc_type lu                                                 7.442
> mpiexec -n 2 -pc_type lu                                    9.112
> -pc_type lu -pc_factor_mat_solver_type mumps                8.748
> mpiexec -n 2 -pc_type lu -pc_factor_mat_solver_type mumps   9.013
>
> For different size problems
> -pc_type lu -m 1000 -n 1000                                 7.442
> -pc_type lu -m 750 -n 750                                   3.142
> -pc_type lu -m 500 -n 500                                   1.007
> -pc_type lu -m 250 -n 250                                   0.150
> -pc_type lu -m 100 -n 100                                   0.016
>
> <image001.png>
>
> From: petsc-users <[email protected]> on behalf of Zou, Ling via petsc-users <[email protected]>
> Date: Friday, March 29, 2024 at 12:50 PM
> To: Barry Smith <[email protected]>
> Cc: [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> I cannot believe that I typed: make ex02
> Thanks, it works.
>
> -Ling
>
> From: Barry Smith <[email protected]>
> Date: Friday, March 29, 2024 at 12:43 PM
> To: Zou, Ling <[email protected]>
> Cc: Zhang, Hong <[email protected]>, [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> cd src/ksp/ksp/tutorials
> make ex2
>
> On Mar 29, 2024, at 1:10 PM, Zou, Ling <[email protected]> wrote:
>
> Hong, thanks! That’s great to know.
> I’d like to try the ex2 tutorial case locally to see how it performs. I have already installed PETSc 3.20.5 on my Mac.
> Here shows the very last step of installation.
>
> make PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 PETSC_ARCH=arch-opt check
> Running PETSc check examples to verify correct installation
> Using PETSC_DIR=/Users/lingzou/Downloads/petsc-3.20.5 and PETSC_ARCH=arch-opt
> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process
> C/C++ example src/snes/tutorials/ex19 run successfully with 2 MPI processes
> Completed PETSc check examples
>
> I found myself not knowing how to compile petsc/src/ksp/ksp/tutorials/ex2.c
> Do we have a page for how to do that?
>
> Best,
>
> -Ling
>
> From: Zhang, Hong <[email protected]>
> Date: Thursday, March 28, 2024 at 4:59 PM
> To: Zou, Ling <[email protected]>, Barry Smith <[email protected]>
> Cc: [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> Ling,
> MUMPS https://mumps-solver.org/index.php, superlu and superlu_dist https://portal.nersc.gov/project/sparse/superlu/ are sparse LU solvers, i.e., they produce SPARSE LU matrix factors. For many applications, they can solve 1 million DOF easily even in sequential mode.
> For example
> petsc/src/ksp/ksp/tutorials
> ./ex2 -pc_type lu -pc_factor_mat_solver_type mumps -m 1000 -n 1000 -ksp_monitor_true_residual
>   0 KSP preconditioned resid norm 1.000000000000e+03 true resid norm 6.330876716538e+01 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 9.976801056860e-09 true resid norm 3.908107755078e-10 ||r(i)||/||b|| 6.173090916254e-12
> Norm of error 9.98582e-09 iterations 1
>
> MUMPS LU solves this matrix of size 1.e6 in one iteration (takes few sec on my laptop).
> As Barry suggests, try mumps first. If it fails or it is too slow, then explore other solvers available in PETSc
> https://petsc.org/release/overview/linear_solve_table/
>
> From my experiments, MUMPS is faster and more robust than superlu/superlu_dist, yet it consumes slightly more memory.
> See https://petsc.org/release/manual/ksp/#using-external-linear-solvers on how to install mumps with petsc.
>
> Hong
>
> From: Zou, Ling <[email protected]>
> Sent: Thursday, March 28, 2024 2:34 PM
> To: Barry Smith <[email protected]>
> Cc: Zhang, Hong <[email protected]>; [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> Thank you. Those are great suggestions. Although I mentioned 1 million DOF, but we rarely go there, so maybe stick with what is working now, and meanwhile seeking helps from literatures.
>
> -Ling
>
> From: Barry Smith <[email protected]>
> Date: Thursday, March 28, 2024 at 2:26 PM
> To: Zou, Ling <[email protected]>
> Cc: Zhang, Hong <[email protected]>, [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> You may benefit from a literature search on your model AND preconditioners to see what others have used. But I would try PETSc/MUMPS on the biggest size you want and see how it goes (better it runs for a little longer and you don't waste months trying to find a good preconditioner).
>
> On Mar 28, 2024, at 2:20 PM, Zou, Ling <[email protected]> wrote:
>
> Thank you, Barry.
> Yes, I have tried different preconditioners, but in a naïve way, i.e., looping through possible options using `-pc_type <option>` command line.
> But no, not in a meaningful way because the lack of understanding of the connection between physics (the problem we are dealing with) to math (the correct combination of those preconditioners).
>
> -Ling
>
> From: Barry Smith <[email protected]>
> Date: Thursday, March 28, 2024 at 1:09 PM
> To: Zou, Ling <[email protected]>
> Cc: Zhang, Hong <[email protected]>, [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> 1 million is possible for direct solvers using PETSc with the MUMPS direct solver when you cannot get a preconditioner to work well for your problems.
>
> ILU are not very robust preconditioners and I would not rely on them. Have you investigated other preconditioners in PETSc? PCGAMG, PCASM, PCFIELDSPLIT, or some combination of these preconditioners work for many problems, though certainly not all.
>
> On Mar 28, 2024, at 1:14 PM, Zou, Ling <[email protected]> wrote:
>
> Thank you, Barry.
> Yeah, this is unfortunate given that the problem we are handling is quite heterogeneous (in both mesh and physics).
> I expect that our problem sizes will be mostly smaller than 1 million DOF. Should LU still be a practical solution? Can it scale well if we choose to run the problem in a parallel way?
>
> PS1: -ksp_norm_type unpreconditioned did not work as the true residual did not go down, even with 300 linear iterations.
> PS2: do you think it would be beneficial to have more detailed discussions (e.g., a presentation?) on the problem we are solving to seek more advice?
>
> -Ling
>
> From: Barry Smith <[email protected]>
> Date: Thursday, March 28, 2024 at 11:14 AM
> To: Zou, Ling <[email protected]>
> Cc: Zhang, Hong <[email protected]>, [email protected]
> Subject: Re: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> This is a bad situation, the solver is not really converging. This can happen with ILU() sometimes, it so badly scales things that the preconditioned residual decreases a lot but the true residual is not really getting smaller. Since your matrices are small, best to stick to LU.
>
> You can use -ksp_norm_type unpreconditioned to force the convergence test to use the true residual, and the solver will discover that it is not converging.
>
> Barry
>
> On Mar 28, 2024, at 11:43 AM, Zou, Ling via petsc-users <[email protected]> wrote:
>
> Hong, thanks! That makes perfect sense.
> A follow up question about ILU.
>
> The following is the performance of ILU(5). Note that each KSP solve reports converged but, as the output shows, the preconditioned residual converges while the true residual does not. Is there any way this performance could be improved?
> Background: the preconditioning matrix is finite difference generated, and should be exact.
>
> -Ling
>
> Time Step 21, time = -491.75, dt = 1
> NL Step = 0, fnorm = 6.98749E+01
>   0 KSP preconditioned resid norm 1.684131526824e+04 true resid norm 6.987489798042e+01 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 5.970568556551e+02 true resid norm 6.459553545222e+01 ||r(i)||/||b|| 9.244455064582e-01
>   2 KSP preconditioned resid norm 3.349113985192e+02 true resid norm 7.250836872274e+01 ||r(i)||/||b|| 1.037688366186e+00
>   3 KSP preconditioned resid norm 3.290585904777e+01 true resid norm 1.186282435163e+02 ||r(i)||/||b|| 1.697723316169e+00
>   4 KSP preconditioned resid norm 8.530606201233e+00 true resid norm 4.088729421459e+01 ||r(i)||/||b|| 5.851499665310e-01
> Linear solve converged due to CONVERGED_RTOL iterations 4
> NL Step = 1, fnorm = 4.08788E+01
>   0 KSP preconditioned resid norm 1.851047973094e+03 true resid norm 4.087882723223e+01 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 3.696809614513e+01 true resid norm 2.720016413105e+01 ||r(i)||/||b|| 6.653851387793e-01
>   2 KSP preconditioned resid norm 5.751891392534e+00 true resid norm 3.326338240872e+01 ||r(i)||/||b|| 8.137068663873e-01
>   3 KSP preconditioned resid norm 8.540729397958e-01 true resid norm 8.672410748720e+00 ||r(i)||/||b|| 2.121492062249e-01
> Linear solve converged due to CONVERGED_RTOL iterations 3
> NL Step = 2, fnorm = 8.67124E+00
>   0 KSP preconditioned resid norm 5.511333966852e+00 true resid norm 8.671237519593e+00 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 1.174962622023e+00 true resid norm 8.731034658309e+00 ||r(i)||/||b|| 1.006896032842e+00
>   2 KSP preconditioned resid norm 1.104604471016e+00 true resid norm 1.018397505468e+01 ||r(i)||/||b|| 1.174454630227e+00
>   3 KSP preconditioned resid norm 4.257063674222e-01 true resid norm 4.023093124996e+00 ||r(i)||/||b|| 4.639583584126e-01
>   4 KSP preconditioned resid norm 1.023038868263e-01 true resid norm 2.365298462869e+00 ||r(i)||/||b|| 2.727751901068e-01
>   5 KSP preconditioned resid norm 4.073772638935e-02 true resid norm 2.302623112025e+00 ||r(i)||/||b|| 2.655472309255e-01
>   6 KSP preconditioned resid norm 1.510323179379e-02 true resid norm 2.300216593521e+00 ||r(i)||/||b|| 2.652697020839e-01
>   7 KSP preconditioned resid norm 1.337324816903e-02 true resid norm 2.300057733345e+00 ||r(i)||/||b|| 2.652513817259e-01
>   8 KSP preconditioned resid norm 1.247384902656e-02 true resid norm 2.300456226062e+00 ||r(i)||/||b|| 2.652973374174e-01
>   9 KSP preconditioned resid norm 1.247038855375e-02 true resid norm 2.300532560993e+00 ||r(i)||/||b|| 2.653061406512e-01
>  10 KSP preconditioned resid norm 1.244611343317e-02 true resid norm 2.299441241514e+00 ||r(i)||/||b|| 2.651802855496e-01
>  11 KSP preconditioned resid norm 1.227243209527e-02 true resid norm 2.273668115236e+00 ||r(i)||/||b|| 2.622080308720e-01
>  12 KSP preconditioned resid norm 1.172621459354e-02 true resid norm 2.113927895437e+00 ||r(i)||/||b|| 2.437861828442e-01
>  13 KSP preconditioned resid norm 2.880752338189e-03 true resid norm 1.076190247720e-01 ||r(i)||/||b|| 1.241103412620e-02
> Linear solve converged due to CONVERGED_RTOL iterations 13
> NL Step = 3, fnorm = 1.59729E-01
>   0 KSP preconditioned resid norm 1.676948440854e+03 true resid norm 1.597288981238e-01 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 2.266131510513e+00 true resid norm 1.819663943811e+00 ||r(i)||/||b|| 1.139220244542e+01
>   2 KSP preconditioned resid norm 2.239911493901e+00 true resid norm 1.923976907755e+00 ||r(i)||/||b|| 1.204526501062e+01
>   3 KSP preconditioned resid norm 1.446859034276e-01 true resid norm 8.692945031946e-01 ||r(i)||/||b|| 5.442312026225e+00
> Linear solve converged due to CONVERGED_RTOL iterations 3
> NL Step = 4, fnorm = 1.59564E-01
>   0 KSP preconditioned resid norm 1.509663716414e+03 true resid norm 1.595641817504e-01 ||r(i)||/||b|| 1.000000000000e+00
>   1 KSP preconditioned resid norm 1.995956587709e+00 true resid norm 1.712323298361e+00 ||r(i)||/||b|| 1.073125108390e+01
>   2 KSP preconditioned resid norm 1.994336275847e+00 true resid norm 1.741263472491e+00 ||r(i)||/||b|| 1.091262119975e+01
>   3 KSP preconditioned resid norm 1.268035008497e-01 true resid norm 8.197057317360e-01 ||r(i)||/||b|| 5.137153731769e+00
> Linear solve converged due to CONVERGED_RTOL iterations 3
> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 4
> Solve Did NOT Converge!
>
> From: Zhang, Hong <[email protected]>
> Date: Wednesday, March 27, 2024 at 4:59 PM
> To: [email protected], Zou, Ling <[email protected]>
> Subject: Re: Does ILU(15) still make sense or should just use LU?
>
> Ling,
> ILU(level) is used for saving storage space with more computations. Normally, we use level=1 or 2. It does not make sense to use level 15. If you have sufficient space, LU would be the best.
> Hong
>
> From: petsc-users <[email protected]> on behalf of Zou, Ling via petsc-users <[email protected]>
> Sent: Wednesday, March 27, 2024 4:24 PM
> To: [email protected]
> Subject: [petsc-users] Does ILU(15) still make sense or should just use LU?
>
> Hi, I’d like to avoid using LU, but in some cases to use ILU and still converge, I have to go to ILU(15), i.e., `-pc_factor_levels 15`. Does it still make sense, or should I give it up and switch to LU?
>
> For this particular case, ~2k DoF, and both ILU(15) and LU perform similarly in terms of wall time.
>
> -Ling
