are there any safe subsets of -march=whatever? i had it on to take advantage of simd ops on avx512 chips but never looked this closely at the exact results.
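
For context on why the instruction set can matter at all: floating-point addition is not associative, so a sum accumulated in several SIMD lanes can round differently from the same sum taken one element at a time. A minimal sketch in plain Python, not PETSc code; `lane_sum` is a hypothetical helper, the data is made up, and the lane counts only mimic what a vectorizing compiler might choose:

```python
import math


def lane_sum(values, lanes):
    """Sum `values` using `lanes` independent partial accumulators,
    then combine the partials, roughly as a SIMD reduction would."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total


# Hypothetical data chosen so the rounding is easy to follow by hand:
# 1e16 + 1.0 rounds back to 1e16 in double precision.
values = [1e16, 1.0, -1e16, 1.0]

scalar = lane_sum(values, 1)    # strict left-to-right order
two_lane = lane_sum(values, 2)  # even and odd elements accumulated separately
exact = math.fsum(values)       # correctly rounded reference sum

print(scalar, two_lane, exact)  # prints: 1.0 2.0 2.0
```

This only shows that the grouping of additions changes the rounded result. It does not by itself explain why one fixed binary would differ from run to run; for that, the grouping would have to vary between runs (for example through code paths that depend on how the data happens to be aligned in memory), which is an assumption to check rather than something established in this thread.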

On Fri, May 5, 2023 at 4:58 PM Barry Smith <[email protected]> wrote:

> On May 5, 2023, at 4:45 PM, Mark Lohry <[email protected]> wrote:
>
> wow. leaving -O3 and turning off -march=native seems to have made it
> repeatable. this is on an avx2 cpu if it matters.
>
>> out-of-order instructions may be performed thus, two runs may have
>> different order of operations
>
> this is terrifying if true. the source code path is exactly the same every
> time but the cpu does different things?
>
> Sure. And you will see more of it in the future, not less. It is not so
> much the CPU does different things each time but that the same things
> happen in a different order (and different order for floating point
> arithmetic means different results).
>
> On Fri, May 5, 2023 at 10:55 AM Barry Smith <[email protected]> wrote:
>
>> Mark,
>>
>> Thank you. You do have aggressive optimizations: -O3 -march=native,
>> which means out-of-order instructions may be performed thus, two runs may
>> have different order of operations and possibly different round-off values.
>>
>> You could try turning off all of this with -O0 for an experiment and
>> see what happens. My guess is that you will see much smaller differences in
>> the residuals.
>>
>> Barry
>>
>> On May 5, 2023, at 8:11 AM, Mark Lohry <[email protected]> wrote:
>>
>> On Thu, May 4, 2023 at 9:51 PM Barry Smith <[email protected]> wrote:
>>
>>> Send configure.log
>>>
>>> On May 4, 2023, at 5:35 PM, Mark Lohry <[email protected]> wrote:
>>>
>>>> Sure, but why only once and why save to disk? Why not just use that
>>>> computed approximate Jacobian at each Newton step to drive the Newton
>>>> solves along for a bunch of time steps?
>>>
>>> Ah I get what you mean. Okay I did three newton steps with the same LHS,
>>> with a few repeated manual tests. 3 out of 4 times i got the same exact
>>> history.
>>> is it in the realm of possibility that a hardware error could
>>> cause something this subtle, bad memory bit or something?
>>>
>>> 2 runs of 3 newton solves below, ever-so-slightly different.
>>>
>>> 0 SNES Function norm 3.424003312857e+04
>>> 0 KSP Residual norm 3.424003312857e+04
>>> 1 KSP Residual norm 2.886124328003e+04
>>> 2 KSP Residual norm 2.504664994246e+04
>>> 3 KSP Residual norm 2.104615835161e+04
>>> 4 KSP Residual norm 1.938102896632e+04
>>> 5 KSP Residual norm 1.793774642408e+04
>>> 6 KSP Residual norm 1.671392566980e+04
>>> 7 KSP Residual norm 1.501504103873e+04
>>> 8 KSP Residual norm 1.366362900747e+04
>>> 9 KSP Residual norm 1.240398500429e+04
>>> 10 KSP Residual norm 1.156293733914e+04
>>> 11 KSP Residual norm 1.066296477958e+04
>>> 12 KSP Residual norm 9.835601966950e+03
>>> 13 KSP Residual norm 9.017480191491e+03
>>> 14 KSP Residual norm 8.415336139780e+03
>>> 15 KSP Residual norm 7.807497808435e+03
>>> 16 KSP Residual norm 7.341703768294e+03
>>> 17 KSP Residual norm 6.979298049282e+03
>>> 18 KSP Residual norm 6.521277772081e+03
>>> 19 KSP Residual norm 6.174842408773e+03
>>> 20 KSP Residual norm 5.889819665003e+03
>>> Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>> type: gmres
>>> restart=30, using Classical (unmodified) Gram-Schmidt
>>> Orthogonalization with no iterative refinement
>>> happy breakdown tolerance 1e-30
>>> maximum iterations=20, initial guess is zero
>>> tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348433e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348433e+04 >>> 0 KSP Residual norm 1.000525348433e+04 >>> 1 KSP Residual norm 7.908741564765e+03 >>> 2 KSP Residual norm 6.825263536686e+03 >>> 3 KSP Residual norm 6.224930664968e+03 >>> 4 KSP Residual norm 6.095547180532e+03 >>> 5 KSP Residual norm 5.952968230430e+03 >>> 6 KSP Residual norm 5.861251998116e+03 >>> 7 KSP Residual norm 5.712439327755e+03 >>> 8 KSP Residual norm 5.583056913266e+03 >>> 9 KSP Residual norm 5.461768804626e+03 >>> 10 KSP Residual norm 5.351937611098e+03 >>> 11 KSP Residual norm 5.224288337578e+03 >>> 12 KSP Residual norm 5.129863847081e+03 >>> 13 KSP Residual norm 5.010818237218e+03 >>> 14 KSP Residual norm 4.907162936199e+03 >>> 15 KSP Residual norm 4.789564773955e+03 >>> 16 KSP Residual norm 4.695173370720e+03 >>> 17 KSP Residual norm 4.584070962171e+03 >>> 18 KSP Residual norm 4.483061424742e+03 >>> 19 KSP Residual norm 4.373384070745e+03 >>> 20 KSP Residual norm 4.260704657592e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014882e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014882e+03 >>> 0 KSP Residual norm 4.662386014882e+03 >>> 1 KSP Residual norm 4.408316259864e+03 >>> 2 KSP Residual norm 4.184867769829e+03 >>> 3 KSP Residual norm 4.079091244351e+03 >>> 4 KSP Residual norm 4.009247390166e+03 >>> 5 KSP Residual norm 3.928417371428e+03 >>> 6 KSP Residual norm 3.865152075780e+03 >>> 7 KSP Residual norm 3.795606446033e+03 >>> 8 KSP Residual norm 3.735294554158e+03 >>> 9 KSP Residual norm 3.674393726487e+03 >>> 10 KSP Residual norm 3.617795166786e+03 >>> 11 KSP Residual norm 3.563807982274e+03 >>> 12 KSP Residual norm 3.512269444921e+03 >>> 13 KSP Residual norm 3.455110223236e+03 >>> 14 KSP Residual norm 3.407141247372e+03 >>> 15 KSP Residual norm 3.356562415982e+03 >>> 16 KSP Residual norm 3.312720047685e+03 >>> 17 KSP Residual norm 3.263690150810e+03 >>> 18 KSP Residual norm 3.219359862444e+03 >>> 19 KSP Residual norm 3.173500955995e+03 >>> 20 KSP Residual norm 3.127528790155e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172556e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> >>> >>> 0 SNES Function norm 3.424003312857e+04 >>> 0 KSP Residual norm 3.424003312857e+04 >>> 1 KSP Residual norm 2.886124328003e+04 >>> 2 KSP Residual norm 2.504664994221e+04 >>> 3 KSP Residual norm 2.104615835130e+04 >>> 4 KSP Residual norm 1.938102896610e+04 >>> 5 KSP Residual norm 1.793774642406e+04 >>> 6 KSP Residual norm 1.671392566981e+04 >>> 7 KSP Residual norm 1.501504103854e+04 >>> 8 KSP Residual norm 1.366362900726e+04 >>> 9 KSP Residual norm 1.240398500414e+04 >>> 10 KSP Residual norm 1.156293733914e+04 >>> 11 KSP Residual norm 1.066296477972e+04 >>> 12 KSP Residual norm 9.835601967036e+03 >>> 13 KSP Residual norm 9.017480191500e+03 >>> 14 KSP Residual norm 8.415336139732e+03 >>> 15 KSP Residual norm 7.807497808414e+03 >>> 16 KSP Residual norm 7.341703768300e+03 >>> 17 KSP Residual norm 6.979298049244e+03 >>> 18 KSP Residual norm 6.521277772042e+03 >>> 19 KSP Residual norm 6.174842408713e+03 >>> 20 KSP Residual norm 5.889819664983e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 1.000525348435e+04 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 1.000525348435e+04 >>> 0 KSP Residual norm 1.000525348435e+04 >>> 1 KSP Residual norm 7.908741565645e+03 >>> 2 KSP Residual norm 6.825263536988e+03 >>> 3 KSP Residual norm 6.224930664967e+03 >>> 4 KSP Residual norm 6.095547180474e+03 >>> 5 KSP Residual norm 5.952968230397e+03 >>> 6 KSP Residual norm 5.861251998127e+03 >>> 7 KSP Residual norm 5.712439327726e+03 >>> 8 KSP Residual norm 5.583056913167e+03 >>> 9 KSP Residual norm 5.461768804526e+03 >>> 10 KSP Residual norm 5.351937611030e+03 >>> 11 KSP Residual norm 5.224288337536e+03 >>> 12 KSP Residual norm 5.129863847028e+03 >>> 13 KSP Residual norm 5.010818237161e+03 >>> 14 KSP Residual norm 4.907162936143e+03 >>> 15 KSP Residual norm 4.789564773923e+03 >>> 16 KSP Residual norm 4.695173370709e+03 >>> 17 KSP Residual norm 4.584070962145e+03 >>> 18 KSP Residual norm 4.483061424714e+03 >>> 19 KSP Residual norm 4.373384070713e+03 >>> 20 KSP Residual norm 4.260704657576e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 4.662386014874e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 0 SNES Function norm 4.662386014874e+03 >>> 0 KSP Residual norm 4.662386014874e+03 >>> 1 KSP Residual norm 4.408316259834e+03 >>> 2 KSP Residual norm 4.184867769891e+03 >>> 3 KSP Residual norm 4.079091244367e+03 >>> 4 KSP Residual norm 4.009247390184e+03 >>> 5 KSP Residual norm 3.928417371457e+03 >>> 6 KSP Residual norm 3.865152075802e+03 >>> 7 KSP Residual norm 3.795606446041e+03 >>> 8 KSP Residual norm 3.735294554160e+03 >>> 9 KSP Residual norm 3.674393726485e+03 >>> 10 KSP Residual norm 3.617795166775e+03 >>> 11 KSP Residual norm 3.563807982249e+03 >>> 12 KSP Residual norm 3.512269444873e+03 >>> 13 KSP Residual norm 3.455110223193e+03 >>> 14 KSP Residual norm 3.407141247334e+03 >>> 15 KSP Residual norm 3.356562415949e+03 >>> 16 KSP Residual norm 3.312720047652e+03 >>> 17 KSP Residual norm 3.263690150782e+03 >>> 18 KSP Residual norm 3.219359862425e+03 >>> 19 KSP Residual norm 3.173500955997e+03 >>> 20 KSP Residual norm 3.127528790156e+03 >>> Linear solve converged due to CONVERGED_ITS iterations 20 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> 1 SNES Function norm 3.186752172503e+03 >>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>> SNES Object: 1 MPI process >>> type: newtonls >>> maximum iterations=1, maximum function evaluations=-1 >>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>> total number of linear solver iterations=20 >>> total number of function evaluations=2 >>> norm schedule ALWAYS >>> Jacobian is never rebuilt >>> Jacobian is built using finite differences with coloring >>> SNESLineSearch Object: 1 MPI process >>> type: basic >>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>> lambda=1.000000e-08 >>> maximum iterations=40 >>> KSP Object: 1 MPI process >>> type: gmres >>> restart=30, using Classical (unmodified) Gram-Schmidt >>> Orthogonalization with no iterative refinement >>> happy breakdown tolerance 1e-30 >>> maximum iterations=20, initial guess is zero >>> tolerances: relative=0.1, absolute=1e-15, divergence=10. 
>>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object: 1 MPI process >>> type: none >>> linear system matrix = precond matrix: >>> Mat Object: 1 MPI process >>> type: seqbaij >>> rows=16384, cols=16384, bs=16 >>> total: nonzeros=1277952, allocated nonzeros=1277952 >>> total number of mallocs used during MatSetValues calls=0 >>> block size is 16 >>> >>> On Thu, May 4, 2023 at 5:22 PM Matthew Knepley <[email protected]> >>> wrote: >>> >>>> On Thu, May 4, 2023 at 5:03 PM Mark Lohry <[email protected]> wrote: >>>> >>>>> Do you get different results (in different runs) without >>>>>> -snes_mf_operator? So just using an explicit matrix? >>>>> >>>>> >>>>> Unfortunately I don't have an explicit matrix available for this, >>>>> hence the MFFD/JFNK. >>>>> >>>> >>>> I don't mean the actual matrix, I mean a representative matrix. >>>> >>>> >>>>> >>>>>> (Note: I am not convinced there is even a problem and think it may >>>>>> be simply different order of floating point operations in different >>>>>> runs.) >>>>>> >>>>> >>>>> I'm not convinced either, but running explicit RK for 10,000 >>>>> iterations i get exactly the same results every time so i'm fairly >>>>> confident it's not the residual evaluation. >>>>> How would there be a different order of floating point ops in >>>>> different runs in serial? >>>>> >>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>>> that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. >>>>>> >>>>>> I could do the sparse finite difference jacobian once, save it to >>>>> disk, and then use that system each time. >>>>> >>>> >>>> Yes. That would work. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> On Thu, May 4, 2023 at 4:57 PM Matthew Knepley <[email protected]> >>>>> wrote: >>>>> >>>>>> On Thu, May 4, 2023 at 4:44 PM Mark Lohry <[email protected]> wrote: >>>>>> >>>>>>> Is your code valgrind clean? 
>>>>>>>> >>>>>>> >>>>>>> Yes, I also initialize all allocations with NaNs to be sure I'm not >>>>>>> using anything uninitialized. >>>>>>> >>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual >>>>>>>> matrix and run. Do you see any variability? >>>>>>>> >>>>>>> >>>>>>> I think I did what you're asking. I have -snes_mf_operator set, and >>>>>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where >>>>>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still >>>>>>> with >>>>>>> differences but sometimes identical. >>>>>>> >>>>>> >>>>>> No, I mean without -snes_mf_* (as Barry says), so we are just running >>>>>> that solver with a sparse matrix. This would give me confidence >>>>>> that nothing in the solver is variable. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>> 3 KSP Residual norm 2.131675872968e+04 >>>>>>> 4 KSP Residual norm 1.973129814235e+04 >>>>>>> 5 KSP Residual norm 1.832377856317e+04 >>>>>>> 6 KSP Residual norm 1.716783617436e+04 >>>>>>> 7 KSP Residual norm 1.583963149542e+04 >>>>>>> 8 KSP Residual norm 1.482272170304e+04 >>>>>>> 9 KSP Residual norm 1.380312106742e+04 >>>>>>> 10 KSP Residual norm 1.297793480658e+04 >>>>>>> 11 KSP Residual norm 1.208599123244e+04 >>>>>>> 12 KSP Residual norm 1.137345655227e+04 >>>>>>> 13 KSP Residual norm 1.059676909366e+04 >>>>>>> 14 KSP Residual norm 1.003823862398e+04 >>>>>>> 15 KSP Residual norm 9.425879221354e+03 >>>>>>> 16 KSP Residual norm 8.954805890038e+03 >>>>>>> 17 KSP Residual norm 8.592372470456e+03 >>>>>>> 18 KSP Residual norm 8.060707175821e+03 >>>>>>> 19 KSP Residual norm 7.782057728723e+03 >>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP Object: 1 MPI process >>>>>>> 
type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>> coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>> lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative 
refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>> 3 KSP Residual norm 2.131675873776e+04 >>>>>>> 4 KSP Residual norm 1.973129814908e+04 >>>>>>> 5 KSP Residual norm 1.832377852186e+04 >>>>>>> 6 KSP Residual norm 1.716783608174e+04 >>>>>>> 7 KSP Residual norm 1.583963128956e+04 >>>>>>> 8 KSP Residual norm 1.482272160069e+04 >>>>>>> 9 KSP Residual norm 1.380312087005e+04 >>>>>>> 10 KSP Residual norm 1.297793458796e+04 >>>>>>> 11 KSP Residual norm 1.208599115602e+04 >>>>>>> 12 KSP Residual norm 1.137345657533e+04 >>>>>>> 13 KSP Residual norm 1.059676906197e+04 >>>>>>> 14 KSP Residual norm 1.003823857515e+04 >>>>>>> 15 KSP Residual norm 9.425879177747e+03 >>>>>>> 16 KSP Residual norm 8.954805850825e+03 >>>>>>> 17 KSP Residual norm 8.592372413320e+03 >>>>>>> 18 KSP Residual norm 8.060706994110e+03 >>>>>>> 19 KSP Residual norm 7.782057560782e+03 >>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>> KSP 
Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>> SNES Object: 1 MPI process >>>>>>> type: newtonls >>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>> total number of linear solver iterations=20 >>>>>>> total number of function evaluations=23 >>>>>>> norm schedule ALWAYS >>>>>>> Jacobian is never rebuilt >>>>>>> Jacobian is applied matrix-free with differencing >>>>>>> Preconditioning Jacobian is built using finite differences with >>>>>>> coloring >>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>> type: basic >>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>> lambda=1.000000e-08 >>>>>>> maximum iterations=40 >>>>>>> KSP Object: 1 MPI process >>>>>>> type: gmres >>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>> 
Orthogonalization with no iterative refinement >>>>>>> happy breakdown tolerance 1e-30 >>>>>>> maximum iterations=20, initial guess is zero >>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>> left preconditioning >>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>> PC Object: 1 MPI process >>>>>>> type: none >>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>> Mat Object: 1 MPI process >>>>>>> type: mffd >>>>>>> rows=16384, cols=16384 >>>>>>> Matrix-free approximation: >>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>> Using wp compute h routine >>>>>>> Does not compute normU >>>>>>> Mat Object: 1 MPI process >>>>>>> type: seqaij >>>>>>> rows=16384, cols=16384 >>>>>>> total: nonzeros=16384, allocated nonzeros=16384 >>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>> not using I-node routines >>>>>>> >>>>>>> On Thu, May 4, 2023 at 10:10 AM Matthew Knepley <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> On Thu, May 4, 2023 at 8:54 AM Mark Lohry <[email protected]> wrote: >>>>>>>> >>>>>>>>> Try -pc_type none. >>>>>>>>>> >>>>>>>>> >>>>>>>>> With -pc_type none the 0 KSP residual looks identical. But >>>>>>>>> *sometimes* it's producing exactly the same history and others it's >>>>>>>>> gradually changing. I'm reasonably confident my residual evaluation >>>>>>>>> has no >>>>>>>>> randomness, see info after the petsc output. >>>>>>>>> >>>>>>>> >>>>>>>> We can try and test this. Replace your MatMFFD with an actual >>>>>>>> matrix and run. Do you see any variability? >>>>>>>> >>>>>>>> If not, then it could be your routine, or it could be MatMFFD. So >>>>>>>> run a few with -snes_view, and we can see if the >>>>>>>> "w" parameter changes. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> Matt >>>>>>>> >>>>>>>> >>>>>>>>> solve history 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 2, identical to 1: >>>>>>>>> >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276931041e+04 >>>>>>>>> ... >>>>>>>>> 20 KSP Residual norm 7.449686034356e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015821006e+04 >>>>>>>>> >>>>>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, >>>>>>>>> growing difference to the end: >>>>>>>>> 0 SNES Function norm 3.424003312857e+04 >>>>>>>>> 0 KSP Residual norm 3.424003312857e+04 >>>>>>>>> 1 KSP Residual norm 2.871734444536e+04 >>>>>>>>> 2 KSP Residual norm 2.490276930242e+04 >>>>>>>>> ... 
>>>>>>>>> 20 KSP Residual norm 7.449686095424e+03 >>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>> 1 SNES Function norm 1.085015646971e+04 >>>>>>>>> >>>>>>>>> >>>>>>>>> This is using a standard explicit 3-stage Runge-Kutta smoother for >>>>>>>>> 10 iterations, so 30 calls of the same residual evaluation, identical >>>>>>>>> residuals every time >>>>>>>>> >>>>>>>>> run 1: >>>>>>>>> >>>>>>>>> # iteration rho rhou >>>>>>>>> rhov rhoE abs_res rel_res >>>>>>>>> umin vmax vmin >>>>>>>>> elapsed_time >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 >>>>>>>>> -1.35563e-14 >>>>>>>>> 6.34834e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 >>>>>>>>> -1.69925e-14 >>>>>>>>> 6.40063e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 >>>>>>>>> -1.59733e-14 >>>>>>>>> 6.45166e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 >>>>>>>>> -1.46238e-14 >>>>>>>>> 6.50494e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 >>>>>>>>> -1.06666e-14 >>>>>>>>> 6.55656e-01 >>>>>>>>> 6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 >>>>>>>>> -1.39713e-14 >>>>>>>>> 6.60872e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>>> 6.99144e-01 -2.95352e-13 
-2.48158e-14 >>>>>>>>> -2.39351e-14 >>>>>>>>> 6.66041e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 >>>>>>>>> 3.78905e-14 >>>>>>>>> 6.71316e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 >>>>>>>>> 1.09825e-13 >>>>>>>>> 6.76447e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 >>>>>>>>> -2.28736e-13 >>>>>>>>> 6.81716e-01 >>>>>>>>> >>>>>>>>> run N: >>>>>>>>> >>>>>>>>> >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> # iteration rho rhou >>>>>>>>> rhov rhoE abs_res rel_res >>>>>>>>> umin vmax vmin >>>>>>>>> elapsed_time >>>>>>>>> # >>>>>>>>> >>>>>>>>> >>>>>>>>> 1.00000e+00 1.086860616292e+00 2.782316758416e+02 >>>>>>>>> 4.482867643761e+00 2.993435920340e+02 2.04353e+02 >>>>>>>>> 1.00000e+00 -8.23945e-15 -6.15326e-15 >>>>>>>>> -1.35563e-14 >>>>>>>>> 6.23316e-01 >>>>>>>>> 2.00000e+00 2.310547487017e+00 1.079059352425e+02 >>>>>>>>> 3.958323921837e+00 5.058927165686e+02 2.58647e+02 >>>>>>>>> 1.26568e+00 -1.02539e-14 -9.35368e-15 >>>>>>>>> -1.69925e-14 >>>>>>>>> 6.28510e-01 >>>>>>>>> 3.00000e+00 2.361005867444e+00 5.706213331683e+01 >>>>>>>>> 6.130016323357e+00 4.688968362579e+02 2.36201e+02 >>>>>>>>> 1.15585e+00 -1.19370e-14 -1.15216e-14 >>>>>>>>> -1.59733e-14 >>>>>>>>> 6.33558e-01 >>>>>>>>> 4.00000e+00 2.167518999963e+00 3.757541401594e+01 >>>>>>>>> 6.313917437428e+00 4.054310291628e+02 2.03612e+02 >>>>>>>>> 9.96372e-01 -1.81831e-14 -1.28312e-14 >>>>>>>>> -1.46238e-14 >>>>>>>>> 6.38773e-01 >>>>>>>>> 5.00000e+00 1.941443738676e+00 2.884190334049e+01 >>>>>>>>> 6.237106158479e+00 3.539201037156e+02 1.77577e+02 >>>>>>>>> 8.68970e-01 3.56633e-14 -8.74089e-15 >>>>>>>>> -1.06666e-14 >>>>>>>>> 6.43887e-01 >>>>>>>>> 
6.00000e+00 1.736947124693e+00 2.429485695670e+01 >>>>>>>>> 5.996962200407e+00 3.148280178142e+02 1.57913e+02 >>>>>>>>> 7.72745e-01 -8.98634e-14 -2.41152e-14 >>>>>>>>> -1.39713e-14 >>>>>>>>> 6.49073e-01 >>>>>>>>> 7.00000e+00 1.564153212635e+00 2.149609219810e+01 >>>>>>>>> 5.786910705204e+00 2.848717011033e+02 1.42872e+02 >>>>>>>>> 6.99144e-01 -2.95352e-13 -2.48158e-14 >>>>>>>>> -2.39351e-14 >>>>>>>>> 6.54167e-01 >>>>>>>>> 8.00000e+00 1.419280815384e+00 1.950619804089e+01 >>>>>>>>> 5.627281158306e+00 2.606623371229e+02 1.30728e+02 >>>>>>>>> 6.39715e-01 8.98941e-13 1.09674e-13 >>>>>>>>> 3.78905e-14 >>>>>>>>> 6.59394e-01 >>>>>>>>> 9.00000e+00 1.296115915975e+00 1.794843530745e+01 >>>>>>>>> 5.514933264437e+00 2.401524522393e+02 1.20444e+02 >>>>>>>>> 5.89394e-01 1.70717e-12 1.38762e-14 >>>>>>>>> 1.09825e-13 >>>>>>>>> 6.64516e-01 >>>>>>>>> 1.00000e+01 1.189639693918e+00 1.665381754953e+01 >>>>>>>>> 5.433183087037e+00 2.222572900473e+02 1.11475e+02 >>>>>>>>> 5.45501e-01 -4.22462e-12 -7.15206e-13 >>>>>>>>> -2.28736e-13 >>>>>>>>> 6.69677e-01 >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, May 4, 2023 at 8:41 AM Mark Adams <[email protected]> wrote: >>>>>>>>> >>>>>>>>>> ASM is just the sub PC with one proc but gets weaker with more >>>>>>>>>> procs unless you use jacobi. (maybe I am missing something). >>>>>>>>>> >>>>>>>>>> On Thu, May 4, 2023 at 8:31 AM Mark Lohry <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Please send the output of -snes_view. >>>>>>>>>>>> >>>>>>>>>>> pasted below. anything stand out? 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> SNES Object: 1 MPI process >>>>>>>>>>> type: newtonls >>>>>>>>>>> maximum iterations=1, maximum function evaluations=-1 >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, solution=1e-15 >>>>>>>>>>> total number of linear solver iterations=20 >>>>>>>>>>> total number of function evaluations=22 >>>>>>>>>>> norm schedule ALWAYS >>>>>>>>>>> Jacobian is never rebuilt >>>>>>>>>>> Jacobian is applied matrix-free with differencing >>>>>>>>>>> Preconditioning Jacobian is built using finite differences >>>>>>>>>>> with coloring >>>>>>>>>>> SNESLineSearch Object: 1 MPI process >>>>>>>>>>> type: basic >>>>>>>>>>> maxstep=1.000000e+08, minlambda=1.000000e-12 >>>>>>>>>>> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >>>>>>>>>>> lambda=1.000000e-08 >>>>>>>>>>> maximum iterations=40 >>>>>>>>>>> KSP Object: 1 MPI process >>>>>>>>>>> type: gmres >>>>>>>>>>> restart=30, using Classical (unmodified) Gram-Schmidt >>>>>>>>>>> Orthogonalization with no iterative refinement >>>>>>>>>>> happy breakdown tolerance 1e-30 >>>>>>>>>>> maximum iterations=20, initial guess is zero >>>>>>>>>>> tolerances: relative=0.1, absolute=1e-15, divergence=10. >>>>>>>>>>> left preconditioning >>>>>>>>>>> using PRECONDITIONED norm type for convergence test >>>>>>>>>>> PC Object: 1 MPI process >>>>>>>>>>> type: asm >>>>>>>>>>> total subdomain blocks = 1, amount of overlap = 0 >>>>>>>>>>> restriction/interpolation type - RESTRICT >>>>>>>>>>> Local solver information for first block is in the >>>>>>>>>>> following KSP and PC objects on rank 0: >>>>>>>>>>> Use -ksp_view ::ascii_info_detail to display information >>>>>>>>>>> for all blocks >>>>>>>>>>> KSP Object: (sub_) 1 MPI process >>>>>>>>>>> type: preonly >>>>>>>>>>> maximum iterations=10000, initial guess is zero >>>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, >>>>>>>>>>> divergence=10000. 
>>>>>>>>>>> left preconditioning >>>>>>>>>>> using NONE norm type for convergence test >>>>>>>>>>> PC Object: (sub_) 1 MPI process >>>>>>>>>>> type: ilu >>>>>>>>>>> out-of-place factorization >>>>>>>>>>> 0 levels of fill >>>>>>>>>>> tolerance for zero pivot 2.22045e-14 >>>>>>>>>>> matrix ordering: natural >>>>>>>>>>> factor fill ratio given 1., needed 1. >>>>>>>>>>> Factored matrix follows: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> package used to perform factorization: petsc >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix = precond matrix: >>>>>>>>>>> Mat Object: (sub_) 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> linear system matrix followed by preconditioner matrix: >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: mffd >>>>>>>>>>> rows=16384, cols=16384 >>>>>>>>>>> Matrix-free approximation: >>>>>>>>>>> err=1.49012e-08 (relative error in function evaluation) >>>>>>>>>>> Using wp compute h routine >>>>>>>>>>> Does not compute normU >>>>>>>>>>> Mat Object: 1 MPI process >>>>>>>>>>> type: seqbaij >>>>>>>>>>> rows=16384, cols=16384, bs=16 >>>>>>>>>>> total: nonzeros=1277952, allocated nonzeros=1277952 >>>>>>>>>>> total number of mallocs used during MatSetValues calls=0 >>>>>>>>>>> block size is 16 >>>>>>>>>>> >>>>>>>>>>> On Thu, May 4, 2023 at 8:30 AM Mark Adams <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> If you are using MG what is the coarse grid solver? >>>>>>>>>>>> -snes_view might give you that. 
>>>>>>>>>>>> >>>>>>>>>>>> On Thu, May 4, 2023 at 8:25 AM Matthew Knepley < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> On Thu, May 4, 2023 at 8:21 AM Mark Lohry <[email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>>> apart? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yes, this. I take it this sounds familiar? >>>>>>>>>>>>>> >>>>>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the >>>>>>>>>>>>>> end. The difference for one solve is slight (final SNES norm is >>>>>>>>>>>>>> identical >>>>>>>>>>>>>> to 5 digits), but in the context I'm using it in (repeated >>>>>>>>>>>>>> applications to >>>>>>>>>>>>>> solve a steady state multigrid problem, though here just one >>>>>>>>>>>>>> level) the >>>>>>>>>>>>>> differences add up such that I might reach global convergence in >>>>>>>>>>>>>> 35 >>>>>>>>>>>>>> iterations or 38. It's not the end of the world, but I was >>>>>>>>>>>>>> expecting that >>>>>>>>>>>>>> with -np 1 these would be identical and I'm not sure where the >>>>>>>>>>>>>> root cause >>>>>>>>>>>>>> would be. >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> The initial KSP residual is different, so it's the PC. >>>>>>>>>>>>> Please send the output of -snes_view. If your ASM is using direct >>>>>>>>>>>>> factorization, then it >>>>>>>>>>>>> could be randomness in whatever LU you are using. >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Matt >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639499595e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999809040e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048521958e+01 >>>>>>>>>>>>>> [...] 
>>>>>>>>>>>>>> 19 KSP Residual norm 8.788043518111e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570851270214e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801309983345e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Same system, identical initial 0 SNES norm, 0 KSP is slightly >>>>>>>>>>>>>> different >>>>>>>>>>>>>> >>>>>>>>>>>>>> 0 SNES Function norm 2.801842107848e+04 >>>>>>>>>>>>>> 0 KSP Residual norm 4.045639473002e+01 >>>>>>>>>>>>>> 1 KSP Residual norm 1.917999883034e+01 >>>>>>>>>>>>>> 2 KSP Residual norm 1.616048572016e+01 >>>>>>>>>>>>>> [...] >>>>>>>>>>>>>> 19 KSP Residual norm 8.788046348957e-01 >>>>>>>>>>>>>> 20 KSP Residual norm 6.570859588610e-01 >>>>>>>>>>>>>> Linear solve converged due to CONVERGED_ITS iterations 20 >>>>>>>>>>>>>> 1 SNES Function norm 1.801311320322e+03 >>>>>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, May 3, 2023 at 11:05 PM Barry Smith <[email protected]> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Do they start very similarly and then slowly drift further >>>>>>>>>>>>>>> apart? That is the first couple of KSP iterations they are >>>>>>>>>>>>>>> almost identical >>>>>>>>>>>>>>> but then for each iteration get a bit further. Similar for the >>>>>>>>>>>>>>> SNES >>>>>>>>>>>>>>> iterations, starting close and then for more iterations and >>>>>>>>>>>>>>> more solves >>>>>>>>>>>>>>> they start moving apart. Or do they suddenly jump to be very >>>>>>>>>>>>>>> different? You >>>>>>>>>>>>>>> can run with -snes_monitor -ksp_monitor >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry <[email protected]> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> This is on a single MPI rank. I haven't checked the >>>>>>>>>>>>>>> coloring, was just guessing there. 
But the solutions/residuals >>>>>>>>>>>>>>> are slightly >>>>>>>>>>>>>>> different from run to run. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Fair to say that for serial JFNK/asm ilu0/gmres we should >>>>>>>>>>>>>>> expect bitwise identical results? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith <[email protected]> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> No, the coloring should be identical every time. Do you >>>>>>>>>>>>>>>> see differences with 1 MPI rank? (Or much smaller ones?). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry <[email protected]> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an >>>>>>>>>>>>>>>> MFFD/JFNK nonlinear solver where I give it the sparsity. PC >>>>>>>>>>>>>>>> asm, KSP gmres, >>>>>>>>>>>>>>>> with SNESSetLagJacobian -2 (compute once and then frozen >>>>>>>>>>>>>>>> jacobian). >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in >>>>>>>>>>>>>>>> residuals from run to run. I'm wondering where randomness >>>>>>>>>>>>>>>> might enter here >>>>>>>>>>>>>>>> -- does the jacobian coloring use a random seed? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> What most experimenters take for granted before they begin >>>>>>>>>>>>> their experiments is infinitely more interesting than any results >>>>>>>>>>>>> to which >>>>>>>>>>>>> their experiments lead. >>>>>>>>>>>>> -- Norbert Wiener >>>>>>>>>>>>> >>>>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/ >>>>>>>>>>>>> <http://www.cse.buffalo.edu/~knepley/> >>> >>> <configure.log>
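For what it's worth, the pattern in "solve history 3" above (identical at KSP 0 and 1, a change in the last digits at 2, then a growing difference) is what one would expect if nothing changes except the order in which floating-point additions are performed. Here is a minimal Python sketch of that effect alone. It is not PETSc code and says nothing about where a reordering would come from in these runs; the function names, the data, and the two-lane grouping are all made up for illustration.

```python
def sum_in_order(values):
    """Add the values one by one, left to right (a plain scalar loop)."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_in_lanes(values, lanes=2):
    """Add the values into `lanes` interleaved partial sums, then combine them.

    This mimics how a vectorized reduction may group the additions; it is an
    illustration only, not what any particular compiler or PETSc kernel does.
    """
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    return sum_in_order(partial)


# Made-up data, chosen so the rounding is easy to follow by hand.
values = [1e16, 1.0, -1e16, 1.0]

print(sum_in_order(values))  # 1.0: the first 1.0 is rounded away when added to 1e16
print(sum_in_lanes(values))  # 2.0: the large terms cancel before the small ones are added
```

Same four numbers, same additions, different grouping, different answer. In a Krylov solve every dot product and norm is a reduction of this kind, so a last-digit difference early on can feed into later iterations and grow, as in the histories quoted above.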
