On Wed, Aug 22, 2018 at 1:33 PM Ellen Price <[email protected]> wrote:
> For Barry: > > Okay, I'm stumped. Some of my Jacobian terms match up with the finite > difference one. Others are off by orders of magnitude. This led me to think > that maybe there was something wrong with my analytic Jacobian, so I went > back and started trying to track down which term causes the problem. The > weird thing is that Mathematica gives the same numerical result as my code > (within a small tolerance), whether i substitute numbers into its analytic > Jacobian or numerically differentiate the function I'm trying to solve > using Mathematica. I can't think of anything else to try for debugging the > Jacobian. Even if the function has a bug in it, Mathematica seems pretty > sure that I'm using the right Jacobian for what I put in... At the > suggestion of the FAQ, I also tried "-mat_fd_type ds" to see if that made > any difference, but it doesn't. Like I said, I'm stumped, but I'll keep > trying. > > For Matt: > > Using options "-pc_type lu -snes_view -snes_monitor -snes_converged_reason > -ksp_monitor_true_residual -ksp_converged_reason -snes_type newtontr", the > output is below. I thought that the step length convergence message had > something to do with using NEWTONTR, but I could be wrong about that. It > doesn't give that message when I switch to NEWTONLS. It also only gives > that message after the first iteration -- the first one is always > CONVERGED_RTOL. > > 7.533921e-03, 2.054699e+02 > 0 SNES Function norm 6.252612941119e+04 > 0 KSP preconditioned resid norm 1.027204021595e-01 true resid norm > 6.252612941119e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 2.641617517320e-17 true resid norm > 3.941610607093e-11 ||r(i)||/||b|| 6.303941478885e-16 > Linear solve converged due to CONVERGED_RTOL iterations 1 > Shoot, I forgot to have you add -snes_linesearch_monitor, but what that would show is that you cut down the step scaling lambda until its 1e-10 or something, so nothing changes in your residual norm. This means your Jacobian does not match your function. > 1 SNES Function norm 6.252085930204e+04 > 0 KSP preconditioned resid norm 3.857372064539e-01 true resid norm > 6.252085930204e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 3.664921095158e-16 true resid norm > 2.434875289829e-10 ||r(i)||/||b|| 3.894500678672e-15 > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > This is another indication that there is mismatch. Now the solution to your linear equation is almost 0, so Newton thinks you should not move at all. Matt > 2 SNES Function norm 6.251886926050e+04 > 0 KSP preconditioned resid norm 4.085000006938e-01 true resid norm > 6.251886926050e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 5.318519617927e-16 true resid norm > 8.189931340015e-10 ||r(i)||/||b|| 1.309993516660e-14 > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > ..... > 48 SNES Function norm 6.229932776333e+04 > 0 KSP preconditioned resid norm 4.872526135345e+00 true resid norm > 6.229932776333e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 7.281571852581e-15 true resid norm > 8.562155918387e-09 ||r(i)||/||b|| 1.374357673796e-13 > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > 49 SNES Function norm 6.229663411026e+04 > 0 KSP preconditioned resid norm 4.953303401996e+00 true resid norm > 6.229663411026e+04 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP preconditioned resid norm 1.537319689814e-15 true resid norm > 4.783442569088e-09 ||r(i)||/||b|| 7.678492806886e-14 > Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 > 50 SNES Function norm 6.229442686875e+04 > Nonlinear solve did not converge due to DIVERGED_MAX_IT iterations 50 > SNES Object: 2 MPI processes > type: newtontr > Trust region tolerance (-snes_trtol) > mu=0.25, eta=0.75, sigma=0.0001 > delta0=0.2, delta1=0.3, delta2=0.75, delta3=2. > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=50 > total number of function evaluations=58 > norm schedule ALWAYS > SNESLineSearch Object: 2 MPI processes > type: basic > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=1 > KSP Object: 2 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 2 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: natural > factor fill ratio given 0., needed 0. > Factored matrix follows: > Mat Object: 2 MPI processes > type: mumps > rows=20, cols=20 > package used to perform factorization: mumps > total: nonzeros=112, allocated nonzeros=112 > total number of mallocs used during MatSetValues calls =0 > MUMPS run parameters: > SYM (matrix type): 0 > PAR (host participation): 1 > ICNTL(1) (output for error): 6 > ICNTL(2) (output of diagnostic msg): 0 > ICNTL(3) (output for global info): 0 > ICNTL(4) (level of printing): 0 > ICNTL(5) (input mat struct): 0 > ICNTL(6) (matrix prescaling): 7 > ICNTL(7) (sequential matrix ordering):7 > ICNTL(8) (scaling strategy): 77 > ICNTL(10) (max num of refinements): 0 > ICNTL(11) (error analysis): 0 > ICNTL(12) (efficiency control): 1 > ICNTL(13) (efficiency control): 0 > ICNTL(14) (percentage of estimated workspace increase): 20 > ICNTL(18) (input mat struct): 3 > ICNTL(19) (Schur complement info): 0 > ICNTL(20) (rhs sparse pattern): 0 > ICNTL(21) (solution struct): 1 > ICNTL(22) (in-core/out-of-core facility): 0 > ICNTL(23) (max size of memory can be allocated locally):0 > ICNTL(24) (detection of null pivot rows): 0 > ICNTL(25) (computation of a null space basis): 0 > ICNTL(26) (Schur options for rhs or solution): 0 > ICNTL(27) (experimental parameter): > -32 > ICNTL(28) (use parallel or sequential ordering): 1 > ICNTL(29) (parallel ordering): 0 > ICNTL(30) (user-specified set of entries in inv(A)): 0 > ICNTL(31) (factors is discarded in the solve phase): 0 > ICNTL(33) (compute determinant): 0 > ICNTL(35) (activate BLR based factorization): 0 > CNTL(1) (relative pivoting threshold): 0.01 > CNTL(2) (stopping criterion of refinement): 1.49012e-08 > CNTL(3) (absolute pivoting threshold): 0. > CNTL(4) (value of static pivoting): -1. > CNTL(5) (fixation for null pivots): 0. > CNTL(7) (dropping parameter for BLR): 0. > RINFO(1) (local estimated flops for the elimination after > analysis): > [0] 127. > [1] 155. > RINFO(2) (local estimated flops for the assembly after > factorization): > [0] 16. > [1] 16. > RINFO(3) (local estimated flops for the elimination after > factorization): > [0] 127. > [1] 155. > INFO(15) (estimated size of (in MB) MUMPS internal data > for running numerical factorization): > [0] 1 > [1] 1 > INFO(16) (size of (in MB) MUMPS internal data used during > numerical factorization): > [0] 1 > [1] 1 > INFO(23) (num of pivots eliminated on this processor after > factorization): > [0] 10 > [1] 10 > RINFOG(1) (global estimated flops for the elimination > after analysis): 282. > RINFOG(2) (global estimated flops for the assembly after > factorization): 32. > RINFOG(3) (global estimated flops for the elimination > after factorization): 282. > (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): > (0.,0.)*(2^0) > INFOG(3) (estimated real workspace for factors on all > processors after analysis): 112 > INFOG(4) (estimated integer workspace for factors on all > processors after analysis): 225 > INFOG(5) (estimated maximum front size in the complete > tree): 4 > INFOG(6) (number of nodes in the complete tree): 9 > INFOG(7) (ordering option effectively use after analysis): > 2 > INFOG(8) (structural symmetry in percent of the permuted > matrix after analysis): 100 > INFOG(9) (total real/complex workspace to store the matrix > factors after factorization): 112 > INFOG(10) (total integer space store the matrix factors > after factorization): 225 > INFOG(11) (order of largest frontal matrix after > factorization): 4 > INFOG(12) (number of off-diagonal pivots): 0 > INFOG(13) (number of delayed pivots after factorization): > 0 > INFOG(14) (number of memory compress after factorization): > 0 > INFOG(15) (number of steps of iterative refinement after > solution): 0 > INFOG(16) (estimated size (in MB) of all MUMPS internal > data for factorization after analysis: value on the most memory consuming > processor): 1 > INFOG(17) (estimated size of all MUMPS internal data for > factorization after analysis: sum over all processors): 2 > INFOG(18) (size of all MUMPS internal data allocated > during factorization: value on the most memory consuming processor): 1 > INFOG(19) (size of all MUMPS internal data allocated > during factorization: sum over all processors): 2 > INFOG(20) (estimated number of entries in the factors): > 112 > INFOG(21) (size in MB of memory effectively used during > factorization - value on the most memory consuming processor): 1 > INFOG(22) (size in MB of memory effectively used during > factorization - sum over all processors): 2 > INFOG(23) (after analysis: value of ICNTL(6) effectively > used): 0 > INFOG(24) (after analysis: value of ICNTL(12) effectively > used): 1 > INFOG(25) (after factorization: number of pivots modified > by static pivoting): 0 > INFOG(28) (after factorization: number of null pivots > encountered): 0 > INFOG(29) (after factorization: effective number of > entries in the factors (sum over all processors)): 112 > INFOG(30, 31) (after solution: size in Mbytes of memory > used during solution phase): 0, 0 > INFOG(32) (after analysis: type of analysis done): 1 > INFOG(33) (value used for ICNTL(8)): 7 > INFOG(34) (exponent of the determinant if determinant is > requested): 0 > linear system matrix = precond matrix: > Mat Object: 2 MPI processes > type: mpiaij > rows=20, cols=20, bs=2 > total: nonzeros=112, allocated nonzeros=240 > total number of mallocs used during MatSetValues calls =0 > > Ellen > > On Tue, Aug 21, 2018 at 10:16 PM Matthew Knepley <[email protected]> > wrote: > >> On Tue, Aug 21, 2018 at 10:00 PM Ellen M. Price < >> [email protected]> wrote: >> >>> Hi PETSc users, >>> >>> I'm having trouble applying SNES to a new problem I'm working on. I'll >>> try to be as complete as possible but can't post the full code because >>> it's ongoing research and is pretty long anyway. >>> >>> The nonlinear problem arises from trying to solve a set of two coupled >>> ODEs using a Galerkin method. I am using Mathematica to generate the >>> objective function to solve and the Jacobian, so I *think* I can rule >>> out human error on that front. >>> >>> There are four things I can easily change: >>> >>> - number of DMDA grid points N (I've tried 100 and 1000) >>> - preconditioner (I've tried LU and SVD, LU appears to work better, and >>> SVD is too slow for N = 1000) >>> - linear solver (haven't played with this much, but sometimes smaller >>> tolerances work better) >>> - nonlinear solver (what I'm having trouble with) >>> >>> Trying different solvers should, as far as I'm aware, give comparable >>> answers, but that's not the case here. NEWTONTR works best of the ones >>> I've tried, but I'm suspicious that the function value barely decreases >>> before SNES "converges" -- and none of the options I've tried changing >>> seem to affect this, as it just finds another reason to converge without >>> making any real progress. For example: >>> >>> 0 SNES Function norm 7.197788479418e+00 >>> 0 KSP Residual norm 1.721996766619e+01 >>> 1 KSP Residual norm 5.186021549059e-14 >>> Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >>> >> >> This makes no sense. LU should converge due to atol or rtol. Send the >> output of >> >> -snes_view -snes_monitor -snes_converged_reason >> -ksp_monitor_true_residual -ksp_converged_reason >> >> Thanks, >> >> Matt >> >> >>> 1 SNES Function norm 7.197777674987e+00 >>> 0 KSP Residual norm 3.296112185897e+01 >>> 1 KSP Residual norm 2.713415432045e-13 >>> Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >>> ..... >>> 50 SNES Function norm 7.197376803542e+00 >>> 0 KSP Residual norm 6.222518656302e+02 >>> 1 KSP Residual norm 9.630659996504e-12 >>> Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 >>> 51 SNES Function norm 7.197376803542e+00 >>> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 50 >>> >>> I've tried everything I can think of and the FAQ suggestions, including >>> non-dimensionalizing the problem; I observe the same behavior either >>> way. The particularly strange thing I can't understand is why some of >>> the SNES methods fail outright, after just one iteration (NCG, NGMRES, >>> and others) with DIVERGED_DTOL. Unless I've misunderstood, it seems >>> like, for the most part, I should be able to substitute in one method >>> for another, possibly adjusting a few parameters along the way. >>> Furthermore, the default method, NEWTONLS, diverges with >>> DIVERGED_LINE_SEARCH, which I'm not sure how to debug. >>> >>> So the only viable method is NEWTONTR, and that doesn't appear to >>> "really" converge. Any suggestions for further things to try are >>> appreciated. My current options are: >>> >>> -pc_type lu -snes_monitor -snes_converged_reason -ksp_converged_reason >>> -snes_max_it 10000 -ksp_monitor -snes_type newtonls >>> >>> Thanks in advance, >>> Ellen Price >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>
