No problem, Matt. I don’t think we had previously discussed that output. Here
is the output from a case where things fail:
0 SNES Function norm 4.027481756921e-09
1 SNES Function norm 1.760477878365e-12
Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 1
0 SNES Function norm 5.066222213176e+03
1 SNES Function norm 8.484697184230e+02
2 SNES Function norm 6.549559723294e+02
3 SNES Function norm 5.770723278153e+02
4 SNES Function norm 5.237702240594e+02
5 SNES Function norm 4.753909019848e+02
6 SNES Function norm 4.221784590755e+02
7 SNES Function norm 3.806525080483e+02
8 SNES Function norm 3.762054656019e+02
9 SNES Function norm 3.758975226873e+02
10 SNES Function norm 3.757032042706e+02
11 SNES Function norm 3.728798164234e+02
12 SNES Function norm 3.723078741075e+02
13 SNES Function norm 3.721848059825e+02
14 SNES Function norm 3.720227575629e+02
15 SNES Function norm 3.720051998555e+02
16 SNES Function norm 3.718945430587e+02
17 SNES Function norm 3.700412694044e+02
18 SNES Function norm 3.351964889461e+02
19 SNES Function norm 3.096016086233e+02
20 SNES Function norm 3.008410789787e+02
21 SNES Function norm 2.752316716557e+02
22 SNES Function norm 2.707658474165e+02
23 SNES Function norm 2.698436736049e+02
24 SNES Function norm 2.618233857172e+02
25 SNES Function norm 2.600121920634e+02
26 SNES Function norm 2.585046423168e+02
27 SNES Function norm 2.568551090220e+02
28 SNES Function norm 2.556404537064e+02
29 SNES Function norm 2.536353523683e+02
30 SNES Function norm 2.533596070171e+02
31 SNES Function norm 2.532324379596e+02
32 SNES Function norm 2.531842335211e+02
33 SNES Function norm 2.531684527520e+02
34 SNES Function norm 2.531637604618e+02
35 SNES Function norm 2.531624767821e+02
36 SNES Function norm 2.531621359093e+02
37 SNES Function norm 2.531620504925e+02
38 SNES Function norm 2.531620350055e+02
39 SNES Function norm 2.531620310522e+02
40 SNES Function norm 2.531620300471e+02
41 SNES Function norm 2.531620298084e+02
42 SNES Function norm 2.531620297478e+02
43 SNES Function norm 2.531620297324e+02
44 SNES Function norm 2.531620297303e+02
45 SNES Function norm 2.531620297302e+02
Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 45
0 SNES Function norm 9.636339304380e+03
1 SNES Function norm 8.997731184634e+03
2 SNES Function norm 8.120498349232e+03
3 SNES Function norm 7.322379894820e+03
4 SNES Function norm 6.599581599149e+03
5 SNES Function norm 6.374872854688e+03
6 SNES Function norm 6.372518007653e+03
7 SNES Function norm 6.073996314301e+03
8 SNES Function norm 5.635965277054e+03
9 SNES Function norm 5.155389064046e+03
10 SNES Function norm 5.080567902638e+03
11 SNES Function norm 5.058878643969e+03
12 SNES Function norm 5.058835649793e+03
13 SNES Function norm 5.058491285707e+03
14 SNES Function norm 5.057452865337e+03
15 SNES Function norm 5.057226140688e+03
16 SNES Function norm 5.056651272898e+03
17 SNES Function norm 5.056575190057e+03
18 SNES Function norm 5.056574632598e+03
19 SNES Function norm 5.056574520229e+03
20 SNES Function norm 5.056574492569e+03
21 SNES Function norm 5.056574485124e+03
22 SNES Function norm 5.056574483029e+03
23 SNES Function norm 5.056574482427e+03
24 SNES Function norm 5.056574482302e+03
25 SNES Function norm 5.056574482287e+03
26 SNES Function norm 5.056574482282e+03
27 SNES Function norm 5.056574482281e+03
Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 27
SNES Object: 1 MPI processes
type: newtonls
maximum iterations=50, maximum function evaluations=10000
tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
total number of linear solver iterations=28
total number of function evaluations=323
total number of grid sequence refinements=2
SNESLineSearch Object: 1 MPI processes
type: bt
interpolation: cubic
alpha=1.000000e-04
maxstep=1.000000e+08, minlambda=1.000000e-12
tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
maximum iterations=40
KSP Object: 1 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 2.22045e-14
matrix ordering: nd
factor fill ratio given 0, needed 0
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=15991, cols=15991
package used to perform factorization: mumps
total: nonzeros=255801, allocated nonzeros=255801
total number of mallocs used during MatSetValues calls =0
MUMPS run parameters:
SYM (matrix type): 0
PAR (host participation): 1
ICNTL(1) (output for error): 6
ICNTL(2) (output of diagnostic msg): 0
ICNTL(3) (output for global info): 0
ICNTL(4) (level of printing): 0
ICNTL(5) (input mat struct): 0
ICNTL(6) (matrix prescaling): 7
ICNTL(7) (sequential matrix ordering): 6
ICNTL(8) (scaling strategy): 77
ICNTL(10) (max num of refinements): 0
ICNTL(11) (error analysis): 0
ICNTL(12) (efficiency control): 1
ICNTL(13) (efficiency control): 0
ICNTL(14) (percentage of estimated workspace increase): 20
ICNTL(18) (input mat struct): 0
ICNTL(19) (Schur complement info): 0
ICNTL(20) (rhs sparse pattern): 0
ICNTL(21) (solution struct): 0
ICNTL(22) (in-core/out-of-core facility): 0
ICNTL(23) (max size of memory can be allocated locally):0
ICNTL(24) (detection of null pivot rows): 0
ICNTL(25) (computation of a null space basis): 0
ICNTL(26) (Schur options for rhs or solution): 0
ICNTL(27) (experimental parameter): -8
ICNTL(28) (use parallel or sequential ordering): 1
ICNTL(29) (parallel ordering): 0
ICNTL(30) (user-specified set of entries in inv(A)): 0
ICNTL(31) (factors is discarded in the solve phase): 0
ICNTL(33) (compute determinant): 0
CNTL(1) (relative pivoting threshold): 0.01
CNTL(2) (stopping criterion of refinement): 1.49012e-08
CNTL(3) (absolute pivoting threshold): 0
CNTL(4) (value of static pivoting): -1
CNTL(5) (fixation for null pivots): 0
RINFO(1) (local estimated flops for the elimination after
analysis):
[0] 1.95838e+06
RINFO(2) (local estimated flops for the assembly after
factorization):
[0] 143924
RINFO(3) (local estimated flops for the elimination after
factorization):
[0] 1.95943e+06
INFO(15) (estimated size of (in MB) MUMPS internal data for
running numerical factorization):
[0] 7
INFO(16) (size of (in MB) MUMPS internal data used during
numerical factorization):
[0] 7
INFO(23) (num of pivots eliminated on this processor after
factorization):
[0] 15991
RINFOG(1) (global estimated flops for the elimination after
analysis): 1.95838e+06
RINFOG(2) (global estimated flops for the assembly after
factorization): 143924
RINFOG(3) (global estimated flops for the elimination after
factorization): 1.95943e+06
(RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0,0)*(2^0)
INFOG(3) (estimated real workspace for factors on all
processors after analysis): 255801
INFOG(4) (estimated integer workspace for factors on all
processors after analysis): 127874
INFOG(5) (estimated maximum front size in the complete tree):
11
INFOG(6) (number of nodes in the complete tree): 3996
INFOG(7) (ordering option effectively use after analysis): 6
INFOG(8) (structural symmetry in percent of the permuted matrix
after analysis): 86
INFOG(9) (total real/complex workspace to store the matrix
factors after factorization): 255865
INFOG(10) (total integer space store the matrix factors after
factorization): 127890
INFOG(11) (order of largest frontal matrix after
factorization): 11
INFOG(12) (number of off-diagonal pivots): 19
INFOG(13) (number of delayed pivots after factorization): 8
INFOG(14) (number of memory compress after factorization): 0
INFOG(15) (number of steps of iterative refinement after
solution): 0
INFOG(16) (estimated size (in MB) of all MUMPS internal data
for factorization after analysis: value on the most memory consuming
processor): 7
INFOG(17) (estimated size of all MUMPS internal data for
factorization after analysis: sum over all processors): 7
INFOG(18) (size of all MUMPS internal data allocated during
factorization: value on the most memory consuming processor): 7
INFOG(19) (size of all MUMPS internal data allocated during
factorization: sum over all processors): 7
INFOG(20) (estimated number of entries in the factors): 255801
INFOG(21) (size in MB of memory effectively used during
factorization - value on the most memory consuming processor): 7
INFOG(22) (size in MB of memory effectively used during
factorization - sum over all processors): 7
INFOG(23) (after analysis: value of ICNTL(6) effectively used):
0
INFOG(24) (after analysis: value of ICNTL(12) effectively
used): 1
INFOG(25) (after factorization: number of pivots modified by
static pivoting): 0
INFOG(28) (after factorization: number of null pivots
encountered): 0
INFOG(29) (after factorization: effective number of entries in
the factors (sum over all processors)): 255865
INFOG(30, 31) (after solution: size in Mbytes of memory used
during solution phase): 5, 5
INFOG(32) (after analysis: type of analysis done): 1
INFOG(33) (value used for ICNTL(8)): 7
INFOG(34) (exponent of the determinant if determinant is
requested): 0
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=15991, cols=15991
total: nonzeros=223820, allocated nonzeros=431698
total number of mallocs used during MatSetValues calls =15991
using I-node routines: found 4000 nodes, limit used is 5
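As a rough aid for reading monitor output like the above, here is a minimal sketch in plain C (no PETSc calls) that finds the first step at which the residual norm stops decreasing appreciably. The function name `first_stalled_step` and the threshold `stall_ratio` are illustrative assumptions, not anything PETSc provides.

```c
#include <stddef.h>

/* Return the index k of the first step whose reduction ratio
 * norms[k+1] / norms[k] exceeds stall_ratio (the residual barely
 * decreased on that step), or -1 if no step stalls.
 * Both the name and the threshold are hypothetical helpers. */
int first_stalled_step(const double *norms, size_t n, double stall_ratio)
{
    for (size_t k = 0; k + 1 < n; ++k) {
        /* Skip zero norms to avoid dividing by zero. */
        if (norms[k] > 0.0 && norms[k + 1] / norms[k] > stall_ratio) {
            return (int)k;
        }
    }
    return -1;
}
```

Given the first nine norms of the second solve above and `stall_ratio = 0.98`, this should flag step 7, the iteration where the norm only drops from 3.8065e+02 to 3.7621e+02.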
-gideon
> On Sep 7, 2015, at 8:40 PM, Matthew Knepley wrote:
>
> On Mon, Sep 7, 2015 at 7:32 PM, Gideon Simpson wrote:
> Barry,
>
> I finally got a chance to really try using the grid sequencing within my
> code. I find that, in some cases, even if it can solve successfully on the
> coarsest mesh, the SNES fails, usually due to a line search failure, when it
> tries to compute along the grid sequence. Would you have any suggestions?
>
> I apologize if I have asked before, but can you give me -snes_view for the
> solver? I could not find it in the email thread.
>
> I would suggest trying to fiddle with the line search, or precondition it
> with Richardson. It would be nice to see -snes_monitor
> for the runs that fail, and then we can break down the residual into fields
> and look at it again (if my custom residual monitor
> does not work we can write one easily). Seeing which part of the residual
> does not converge is key to designing the NASM
> for the problem. I have just seen the virtuoso of this, Xiao-Chuan Cai,
> present it. We need better monitoring in PETSc.
>
> Thanks,
>
> Matt
>
> -gideon
>
>> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote:
>>
>>
>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote:
>>>
>>> Yes, if I continue in this parameter on the coarse mesh, I can generally
>>> solve at all values. I do find that I need to do some amount of
>>> continuation to solve near the endpoint. The problem is that on the coarse
>>> mesh, things are not fully resolved at all the values along the
>>> continuation parameter, and I would like to do refinement.
>>>
>>> One subtlety is that I actually want the intermediate continuation
>>> solutions too. Currently, without doing any grid sequence, I compute
>>> each, write it to disk, and then go on to the next one. So I now need to
>>> go back and refine them. I was thinking that perhaps I could refine them on
>>> the fly, dump them to disk, and use the coarse solution as the starting
>>> guess at the next iteration, but that would seem to require resetting the
>>> snes back to the coarse grid.
>>>
>>> The alternative would be to just script the mesh refinement in a
>>> postprocessing stage, where the solution at each value of the continuation
>>> parameter is loaded on the coarse mesh and refined. Perhaps that’s the most
>>> practical thing to do.
>>
>> I would do the following. Create your DM and create a SNES that will do
>> the continuation
>>
>> loop over continuation parameter
>>
>>     SNESSolve(snes,NULL,Ucoarse);
>>
>>     if (you decide you want to see the refined solution at this
>>         continuation point) {
>>       SNESCreate(comm,&snesrefine);
>>       SNESSetDM()
>>       etc
>>       SNESSetGridSequence(snesrefine,)
>>       SNESSolve(snesrefine,NULL,Ucoarse);
>>       SNESGetSolution(snesrefine,&Ufine);
>>       /* VecView() Ufine, or do whatever you want to do with Ufine at
>>          that continuation point */
>>       SNESDestroy(&snesrefine);
>>     }
>>
>> end loop over continuation parameter.
>>
>> Barry
>>
>>>
>>> -gideon
>>>
>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote:
>>>>
>>>>>
>>>>>
>>>>> 3. This problem is actually part of a continuation problem that roughly
>>>>> looks like this
>>>>>
>>>>> for( continuation parameter p = 0 to 1){
>>>>>
>>>>> solve with parameter p_i using solution from p_{i-1},
>>>>> }
>>>>>
>>>>> What I would like to do is to start the solver, for each value of
>>>>> parameter p_i on the coarse mesh, and then do grid sequencing on that.
>>>>> But it appears that after doing grid sequencing on the initial p_0 = 0,
>>>>> the SNES is set to use the finer mesh.
>>>>
>>>> So you are using continuation to give you a good enough initial guess on
>>>> the coarse level to even get convergence on the coarse level? First I
>>>> would check if you even need the continuation (or can you not even solve
>>>> the coarse problem without it).
>>>>
>>>> If you do need the continuation then you will need to tweak how you do
>>>> the grid sequencing. I think this will work:
>>>>
>>>> Do not use -snes_grid_sequence
>>>>
>>>> Run SNESSolve() as many times as you want with your continuation
>>>> parameter. This will all happen on the coarse mesh.
>>>>
>>>> Call SNESSetGridSequence()
>>>>
>>>> Then call SNESSolve() again and it will do one solve on the coarse level
>>>> and then interpolate to the next level etc.
>>>
>>
>
>
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener