My guess is the Jacobian is not correct (or correct "enough"), hence PETSc SNES is generating a poor descent direction. You can try -snes_mf_operator -ksp_monitor_true_residual as additional arguments. What happens?
Barry

> On Sep 7, 2015, at 7:49 PM, Gideon Simpson <[email protected]> wrote:
>
> No problem Matt, I don’t think we had previously discussed that output. Here
> is a case where things fail.
>
> 0 SNES Function norm 4.027481756921e-09
> 1 SNES Function norm 1.760477878365e-12
> Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
> 0 SNES Function norm 5.066222213176e+03
> 1 SNES Function norm 8.484697184230e+02
> 2 SNES Function norm 6.549559723294e+02
> 3 SNES Function norm 5.770723278153e+02
> 4 SNES Function norm 5.237702240594e+02
> 5 SNES Function norm 4.753909019848e+02
> 6 SNES Function norm 4.221784590755e+02
> 7 SNES Function norm 3.806525080483e+02
> 8 SNES Function norm 3.762054656019e+02
> 9 SNES Function norm 3.758975226873e+02
> 10 SNES Function norm 3.757032042706e+02
> 11 SNES Function norm 3.728798164234e+02
> 12 SNES Function norm 3.723078741075e+02
> 13 SNES Function norm 3.721848059825e+02
> 14 SNES Function norm 3.720227575629e+02
> 15 SNES Function norm 3.720051998555e+02
> 16 SNES Function norm 3.718945430587e+02
> 17 SNES Function norm 3.700412694044e+02
> 18 SNES Function norm 3.351964889461e+02
> 19 SNES Function norm 3.096016086233e+02
> 20 SNES Function norm 3.008410789787e+02
> 21 SNES Function norm 2.752316716557e+02
> 22 SNES Function norm 2.707658474165e+02
> 23 SNES Function norm 2.698436736049e+02
> 24 SNES Function norm 2.618233857172e+02
> 25 SNES Function norm 2.600121920634e+02
> 26 SNES Function norm 2.585046423168e+02
> 27 SNES Function norm 2.568551090220e+02
> 28 SNES Function norm 2.556404537064e+02
> 29 SNES Function norm 2.536353523683e+02
> 30 SNES Function norm 2.533596070171e+02
> 31 SNES Function norm 2.532324379596e+02
> 32 SNES Function norm 2.531842335211e+02
> 33 SNES Function norm 2.531684527520e+02
> 34 SNES Function norm 2.531637604618e+02
> 35 SNES Function norm 2.531624767821e+02
> 36 SNES Function norm 2.531621359093e+02
> 37 SNES Function norm 2.531620504925e+02
> 38 SNES Function norm 2.531620350055e+02
> 39 SNES Function norm 2.531620310522e+02
> 40 SNES Function norm 2.531620300471e+02
> 41 SNES Function norm 2.531620298084e+02
> 42 SNES Function norm 2.531620297478e+02
> 43 SNES Function norm 2.531620297324e+02
> 44 SNES Function norm 2.531620297303e+02
> 45 SNES Function norm 2.531620297302e+02
> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45
> 0 SNES Function norm 9.636339304380e+03
> 1 SNES Function norm 8.997731184634e+03
> 2 SNES Function norm 8.120498349232e+03
> 3 SNES Function norm 7.322379894820e+03
> 4 SNES Function norm 6.599581599149e+03
> 5 SNES Function norm 6.374872854688e+03
> 6 SNES Function norm 6.372518007653e+03
> 7 SNES Function norm 6.073996314301e+03
> 8 SNES Function norm 5.635965277054e+03
> 9 SNES Function norm 5.155389064046e+03
> 10 SNES Function norm 5.080567902638e+03
> 11 SNES Function norm 5.058878643969e+03
> 12 SNES Function norm 5.058835649793e+03
> 13 SNES Function norm 5.058491285707e+03
> 14 SNES Function norm 5.057452865337e+03
> 15 SNES Function norm 5.057226140688e+03
> 16 SNES Function norm 5.056651272898e+03
> 17 SNES Function norm 5.056575190057e+03
> 18 SNES Function norm 5.056574632598e+03
> 19 SNES Function norm 5.056574520229e+03
> 20 SNES Function norm 5.056574492569e+03
> 21 SNES Function norm 5.056574485124e+03
> 22 SNES Function norm 5.056574483029e+03
> 23 SNES Function norm 5.056574482427e+03
> 24 SNES Function norm 5.056574482302e+03
> 25 SNES Function norm 5.056574482287e+03
> 26 SNES Function norm 5.056574482282e+03
> 27 SNES Function norm 5.056574482281e+03
> Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27
> SNES Object: 1 MPI processes
>   type: newtonls
>   maximum iterations=50, maximum function evaluations=10000
>   tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
>   total number of linear solver iterations=28
>   total number of function evaluations=323
>   total number of grid sequence refinements=2
> SNESLineSearch Object: 1 MPI processes
>   type: bt
>   interpolation: cubic
>   alpha=1.000000e-04
>   maxstep=1.000000e+08, minlambda=1.000000e-12
>   tolerances: relative=1.000000e-08, absolute=1.000000e-15,
>   lambda=1.000000e-08
>   maximum iterations=40
> KSP Object: 1 MPI processes
>   type: gmres
>   GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>   Orthogonalization with no iterative refinement
>   GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: lu
>   LU: out-of-place factorization
>   tolerance for zero pivot 2.22045e-14
>   matrix ordering: nd
>   factor fill ratio given 0, needed 0
>   Factored matrix follows:
>   Mat Object: 1 MPI processes
>   type: seqaij
>   rows=15991, cols=15991
>   package used to perform factorization: mumps
>   total: nonzeros=255801, allocated nonzeros=255801
>   total number of mallocs used during MatSetValues calls =0
>   MUMPS run parameters:
>     SYM (matrix type): 0
>     PAR (host participation): 1
>     ICNTL(1) (output for error): 6
>     ICNTL(2) (output of diagnostic msg): 0
>     ICNTL(3) (output for global info): 0
>     ICNTL(4) (level of printing): 0
>     ICNTL(5) (input mat struct): 0
>     ICNTL(6) (matrix prescaling): 7
>     ICNTL(7) (sequentia matrix ordering):6
>     ICNTL(8) (scalling strategy): 77
>     ICNTL(10) (max num of refinements): 0
>     ICNTL(11) (error analysis): 0
>     ICNTL(12) (efficiency control): 1
>     ICNTL(13) (efficiency control): 0
>     ICNTL(14) (percentage of estimated workspace increase): 20
>     ICNTL(18) (input mat struct): 0
>     ICNTL(19) (Shur complement info): 0
>     ICNTL(20) (rhs sparse pattern): 0
>     ICNTL(21) (somumpstion struct): 0
>     ICNTL(22) (in-core/out-of-core facility): 0
>     ICNTL(23) (max size of memory can be allocated locally):0
>     ICNTL(24) (detection of null pivot rows): 0
>     ICNTL(25) (computation of a null space basis): 0
>     ICNTL(26) (Schur options for rhs or solution): 0
>     ICNTL(27) (experimental parameter): -8
>     ICNTL(28) (use parallel or sequential ordering): 1
>     ICNTL(29) (parallel ordering): 0
>     ICNTL(30) (user-specified set of entries in inv(A)): 0
>     ICNTL(31) (factors is discarded in the solve phase): 0
>     ICNTL(33) (compute determinant): 0
>     CNTL(1) (relative pivoting threshold): 0.01
>     CNTL(2) (stopping criterion of refinement): 1.49012e-08
>     CNTL(3) (absomumpste pivoting threshold): 0
>     CNTL(4) (vamumpse of static pivoting): -1
>     CNTL(5) (fixation for null pivots): 0
>     RINFO(1) (local estimated flops for the elimination after
>     analysis):
>     [0] 1.95838e+06
>     RINFO(2) (local estimated flops for the assembly after
>     factorization):
>     [0] 143924
>     RINFO(3) (local estimated flops for the elimination after
>     factorization):
>     [0] 1.95943e+06
>     INFO(15) (estimated size of (in MB) MUMPS internal data for
>     running numerical factorization):
>     [0] 7
>     INFO(16) (size of (in MB) MUMPS internal data used during
>     numerical factorization):
>     [0] 7
>     INFO(23) (num of pivots eliminated on this processor after
>     factorization):
>     [0] 15991
>     RINFOG(1) (global estimated flops for the elimination after
>     analysis): 1.95838e+06
>     RINFOG(2) (global estimated flops for the assembly after
>     factorization): 143924
>     RINFOG(3) (global estimated flops for the elimination after
>     factorization): 1.95943e+06
>     (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
>     INFOG(3) (estimated real workspace for factors on all
>     processors after analysis): 255801
>     INFOG(4) (estimated integer workspace for factors on all
>     processors after analysis): 127874
>     INFOG(5) (estimated maximum front size in the complete tree):
>     11
>     INFOG(6) (number of nodes in the complete tree): 3996
>     INFOG(7) (ordering option effectively use after analysis): 6
>     INFOG(8) (structural symmetry in percent of the permuted
>     matrix after analysis): 86
>     INFOG(9) (total real/complex workspace to store the matrix
>     factors after factorization): 255865
>     INFOG(10) (total integer space store the matrix factors after
>     factorization): 127890
>     INFOG(11) (order of largest frontal matrix after
>     factorization): 11
>     INFOG(12) (number of off-diagonal pivots): 19
>     INFOG(13) (number of delayed pivots after factorization): 8
>     INFOG(14) (number of memory compress after factorization): 0
>     INFOG(15) (number of steps of iterative refinement after
>     solution): 0
>     INFOG(16) (estimated size (in MB) of all MUMPS internal data
>     for factorization after analysis: value on the most memory consuming
>     processor): 7
>     INFOG(17) (estimated size of all MUMPS internal data for
>     factorization after analysis: sum over all processors): 7
>     INFOG(18) (size of all MUMPS internal data allocated during
>     factorization: value on the most memory consuming processor): 7
>     INFOG(19) (size of all MUMPS internal data allocated during
>     factorization: sum over all processors): 7
>     INFOG(20) (estimated number of entries in the factors):
>     255801
>     INFOG(21) (size in MB of memory effectively used during
>     factorization - value on the most memory consuming processor): 7
>     INFOG(22) (size in MB of memory effectively used during
>     factorization - sum over all processors): 7
>     INFOG(23) (after analysis: value of ICNTL(6) effectively
>     used): 0
>     INFOG(24) (after analysis: value of ICNTL(12) effectively
>     used): 1
>     INFOG(25) (after factorization: number of pivots modified by
>     static pivoting): 0
>     INFOG(28) (after factorization: number of null pivots
>     encountered): 0
>     INFOG(29) (after factorization: effective number of entries
>     in the factors (sum over all processors)): 255865
>     INFOG(30, 31) (after solution: size in Mbytes of memory used
>     during solution phase): 5, 5
>     INFOG(32) (after analysis: type of analysis done): 1
>     INFOG(33) (value used for ICNTL(8)): 7
>     INFOG(34) (exponent of the determinant if determinant is
>     requested): 0
>   linear system matrix = precond matrix:
>   Mat Object: 1 MPI processes
>   type: seqaij
>   rows=15991, cols=15991
>   total: nonzeros=223820, allocated nonzeros=431698
>   total number of mallocs used during MatSetValues calls =15991
>   using I-node routines: found 4000 nodes, limit used is 5
>
> -gideon
>
>> On Sep 7, 2015, at 8:40 PM, Matthew Knepley <[email protected]> wrote:
>>
>> On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson <[email protected]>
>> wrote:
>> Barry,
>>
>> I finally got a chance to really try using the grid sequencing within my
>> code. I find that, in some cases, even if it can solve successfully on the
>> coarsest mesh, the SNES fails, usually due to a line search failure, when it
>> tries to compute along the grid sequence. Would you have any suggestions?
>>
>> I apologize if I have asked before, but can you give me -snes_view for the
>> solver? I could not find it in the email thread.
>>
>> I would suggest trying to fiddle with the line search, or precondition it
>> with Richardson. It would be nice to see -snes_monitor
>> for the runs that fail, and then we can break down the residual into fields
>> and look at it again (if my custom residual monitor
>> does not work we can write one easily). Seeing which part of the residual
>> does not converge is key to designing the NASM
>> for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai,
>> present it. We need better monitoring in PETSc.
>>
>> Thanks,
>>
>> Matt
>>
>> -gideon
>>
>>> On Aug 28, 2015, at 4:21 PM, Barry Smith <[email protected]> wrote:
>>>
>>>
>>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson <[email protected]>
>>>> wrote:
>>>>
>>>> Yes, if I continue in this parameter on the coarse mesh, I can generally
>>>> solve at all values. I do find that I need to do some amount of
>>>> continuation to solve near the endpoint. The problem is that on the
>>>> coarse mesh, things are not fully resolved at all the values along the
>>>> continuation parameter, and I would like to do refinement.
>>>>
>>>> One subtlety is that I actually want the intermediate continuation
>>>> solutions too. Currently, without doing any grid sequence, I compute
>>>> each, write it to disk, and then go on to the next one. So I now need to
>>>> go back and refine them. I was thinking that perhaps I could refine them
>>>> on the fly, dump them to disk, and use the coarse solution as the starting
>>>> guess at the next iteration, but that would seem to require resetting the
>>>> snes back to the coarse grid.
>>>>
>>>> The alternative would be to just script the mesh refinement in a post
>>>> processing stage, where each value of the continuation parameter is
>>>> loaded on the coarse mesh, and refined. Perhaps that’s the most practical
>>>> thing to do.
>>>
>>> I would do the following. Create your DM and create a SNES that will do
>>> the continuation
>>>
>>> loop over continuation parameter
>>>
>>>   SNESSolve(snes,NULL,Ucoarse);
>>>
>>>   if (you decide you want to see the refined solution at this
>>>   continuation point) {
>>>     SNESCreate(comm,&snesrefine);
>>>     SNESSetDM()
>>>     etc
>>>     SNESSetGridSequence(snesrefine,)
>>>     SNESSolve(snesrefine,0,Ucoarse);
>>>     SNESGetSolution(snesrefine,&Ufine);
>>>     VecView(Ufine,...) or do whatever you want to do with the Ufine at
>>>     that continuation point
>>>     SNESDestroy(&snesrefine);
>>>   }
>>>
>>> end loop over continuation parameter.
>>>
>>> Barry
>>>
>>>>
>>>> -gideon
>>>>
>>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith <[email protected]> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 3. This problem is actually part of a continuation problem that roughly
>>>>>> looks like this
>>>>>>
>>>>>> for( continuation parameter p = 0 to 1){
>>>>>>
>>>>>>   solve with parameter p_i using solution from p_{i-1},
>>>>>> }
>>>>>>
>>>>>> What I would like to do is to start the solver, for each value of
>>>>>> parameter p_i on the coarse mesh, and then do grid sequencing on that.
>>>>>> But it appears that after doing grid sequencing on the initial p_0 = 0,
>>>>>> the SNES is set to use the finer mesh.
>>>>>
>>>>> So you are using continuation to give you a good enough initial guess on
>>>>> the coarse level to even get convergence on the coarse level? First I
>>>>> would check if you even need the continuation (or can you not even solve
>>>>> the coarse problem without it).
>>>>>
>>>>> If you do need the continuation then you will need to tweak how you do
>>>>> the grid sequencing. I think this will work:
>>>>>
>>>>> Do not use -snes_grid_sequence
>>>>>
>>>>> Run SNESSolve() as many times as you want with your continuation
>>>>> parameter. This will all happen on the coarse mesh.
>>>>>
>>>>> Call SNESSetGridSequence()
>>>>>
>>>>> Then call SNESSolve() again and it will do one solve on the coarse level
>>>>> and then interpolate to the next level etc.
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their experiments
>> is infinitely more interesting than any results to which their experiments
>> lead.
>> -- Norbert Wiener
>
